This paper introduces Weighted Optimal Classification Forests (WOCFs), a new family of classifiers that takes advantage of an optimal ensemble of decision trees to derive accurate and interpretable classifiers. We propose a novel mathematical optimization-based methodology which simultaneously constructs a given number of trees, each of them providing a predicted class for the observations in the feature space. The classification rule is derived by assigning to each observation its most frequently predicted class among the trees. We provide a mixed integer linear programming (MIP) formulation for the problem and several novel MIP strengthening and scaling techniques. We report the results of our computational experiments, from which we conclude that our method has equal or superior performance compared with state-of-the-art tree-based classification methods on small to medium-sized instances. We also present three real-world case studies showing that our methodology has very interesting implications in terms of interpretability. Overall, WOCFs complement existing methods such as CART, Optimal Classification Trees, Random Forests, and XGBoost. In addition to a Pareto improvement in accuracy and interpretability, we also observe unique properties emerging, with different trees focusing on different feature variables. This provides a nontrivial improvement in the interpretability and usability of the trained model in terms of counterfactual explanation. Thus, despite the apparent computational challenge of WOCFs, which limits the size of the problems that can be efficiently solved with current MIP technology, this is an important research direction that can lead to qualitatively different insights for researchers and complement the toolbox of practitioners for high-stakes problems.
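The classification rule described above, assigning each observation its most frequently predicted class among the trees, can be sketched as a (possibly weighted) majority vote. This is a minimal illustration, not the paper's MIP formulation; the function name `forest_predict` and the optional per-tree `weights` argument are assumptions for illustration only.

```python
import numpy as np

def forest_predict(tree_predictions, weights=None):
    """Majority-vote rule: each observation receives the class most
    frequently predicted among the trees (optionally weighted).

    tree_predictions: (n_trees, n_obs) array of predicted class labels.
    weights: optional per-tree vote weights (hypothetical parameter).
    """
    tree_predictions = np.asarray(tree_predictions)
    n_trees, n_obs = tree_predictions.shape
    if weights is None:
        weights = np.ones(n_trees)
    weights = np.asarray(weights, dtype=float)
    classes = np.unique(tree_predictions)
    # For each candidate class, sum the weights of the trees voting for it.
    votes = np.zeros((len(classes), n_obs))
    for k, c in enumerate(classes):
        votes[k] = ((tree_predictions == c) * weights[:, None]).sum(axis=0)
    # Pick, per observation, the class with the largest total vote.
    return classes[np.argmax(votes, axis=0)]

# Example: 3 trees voting on 4 observations.
preds = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 0]]
print(forest_predict(preds))  # [0 1 1 0]
```

In the unweighted case this reduces to the plain plurality rule used by standard forest ensembles; nonuniform weights would let some trees count more heavily in the vote.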