Decision trees are widely used for classification and regression tasks in a variety of application fields due to their interpretability and good accuracy. During the past decade, growing attention has been devoted to globally optimized decision trees with deterministic or soft splitting rules at branch nodes, which are trained by optimizing the error function over all the tree parameters. In this work, we propose a new variant of soft multivariate regression trees (SRTs) where, for every input vector, the prediction is defined as the linear regression associated with a single leaf node, namely, the leaf node obtained by routing the input vector from the root along the higher-probability branches. SRTs exhibit the conditional computation property, i.e., each prediction depends on a small number of nodes (parameters), and our nonlinear optimization formulation for training them is amenable to decomposition. After showing a universal approximation result for SRTs, we present a decomposition training algorithm that includes a clustering-based initialization procedure and a heuristic for reassigning the input vectors along the tree. Under mild assumptions, we establish asymptotic convergence guarantees. Experiments on 15 well-known datasets indicate that our SRTs and decomposition algorithm yield higher accuracy and robustness compared with traditional soft regression trees trained using the nonlinear optimization formulation of Blanquero et al., and a significant reduction in training times as well as slightly better average accuracy compared with the mixed-integer optimization approach of Bertsimas and Dunn. We also report a comparison with the Random Forest ensemble method.
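The prediction rule described above can be illustrated with a minimal sketch. This is not the authors' formulation; it is a toy depth-2 tree with placeholder parameters, showing the two ingredients the abstract names: sigmoid-based soft splits at branch nodes, and a prediction given by the linear regression of the single leaf reached by always following the higher-probability branch (the source of the conditional computation property).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def srt_predict(x, branch_params, leaf_params):
    """Illustrative SRT prediction for a complete depth-2 tree.

    branch_params: list of (w, b) for branch nodes [root, left, right].
    leaf_params:   list of (w, b) for the 4 leaves, left to right.
    The input is routed greedily along the higher-probability branch,
    and only the reached leaf's linear model is evaluated.
    """
    node = 0                          # root (nodes 0..2 are branches, 3..6 leaves)
    for _ in range(2):                # two levels of branch nodes
        w, b = branch_params[node]
        p_left = sigmoid(w @ x + b)   # soft split: probability of going left
        node = 2 * node + (1 if p_left >= 0.5 else 2)
    w_leaf, b_leaf = leaf_params[node - 3]
    return w_leaf @ x + b_leaf        # single-leaf linear regression

# Tiny usage example with arbitrary placeholder weights (2-D inputs).
rng = np.random.default_rng(0)
branch_params = [(rng.normal(size=2), 0.0) for _ in range(3)]
leaf_params = [(rng.normal(size=2), float(k)) for k in range(4)]
y_hat = srt_predict(np.array([0.5, -1.0]), branch_params, leaf_params)
```

Note the conditional computation: of the seven nodes, only two branch nodes and one leaf are touched per prediction, which is what makes the training formulation amenable to decomposition over leaves.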