The Classification Tree (CT) is one of the most common models in interpretable machine learning. Although such models are usually built with greedy strategies, remarkable advances in Mixed-Integer Programming (MIP) solvers have in recent years led to several exact formulations of the learning problem. In this paper, we argue that some of the most relevant of these training models can be encapsulated within a general framework, whose instances are determined by the choice of loss function and regularizer. We then introduce a novel realization of this framework: we consider the logistic loss, handled in the MIP setting through a piecewise linear approximation, and couple it with $\ell_1$-regularization terms. In numerical experiments, the resulting Optimal Logistic Tree model induces trees with enhanced interpretability features and competitive generalization capabilities compared to state-of-the-art MIP-based approaches.
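To illustrate how a logistic loss can be made compatible with a MIP model, the sketch below builds a piecewise linear approximation as the pointwise maximum of tangent lines. This is one common construction for convex losses and is only an assumption here; the abstract does not specify the exact scheme used. The breakpoint values and the function names (`logistic_loss`, `tangent_lines`, `piecewise_linear_loss`) are hypothetical and chosen for illustration.

```python
import math


def logistic_loss(z):
    """Numerically stable evaluation of log(1 + exp(-z))."""
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def tangent_lines(breakpoints):
    """Return (slope, intercept) of the tangent to the logistic loss at each breakpoint.

    The breakpoints are an illustrative assumption, not values from the paper.
    """
    lines = []
    for z0 in breakpoints:
        # d/dz log(1 + exp(-z)) = -1 / (1 + exp(z))
        slope = -1.0 / (1.0 + math.exp(z0))
        intercept = logistic_loss(z0) - slope * z0
        lines.append((slope, intercept))
    return lines


def piecewise_linear_loss(z, lines):
    """Pointwise maximum of the tangents: a lower approximation of the convex loss."""
    return max(a * z + b for a, b in lines)


if __name__ == "__main__":
    lines = tangent_lines([-2.0, -1.0, 0.0, 1.0, 2.0])
    for z in (-3.0, -0.5, 0.0, 0.7, 3.0):
        print(z, logistic_loss(z), piecewise_linear_loss(z, lines))
```

Because the logistic loss is convex, each tangent lies below it, so the maximum of the tangents never overestimates the loss and is exact at the breakpoints. In a MIP, such an approximation would typically be encoded with one linear constraint per tangent on an auxiliary loss variable, which keeps the model linear; adding more breakpoints tightens the approximation at the cost of more constraints.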