The Classification Tree (CT) is one of the most common models in interpretable machine learning. Although such models are usually built with greedy strategies, remarkable advances in Mixed-Integer Programming (MIP) solvers have, in recent years, led to several exact formulations of the learning problem. In this paper, we argue that some of the most relevant of these training models can be encapsulated within a general framework, whose instances are determined by the choice of loss function and regularizer. We then introduce a novel realization of this framework: we consider the logistic loss, handled in the MIP setting by a piecewise linear approximation, and couple it with $\ell_1$-regularization terms. In numerical experiments, the resulting Optimal Logistic Tree model induces trees with enhanced interpretability features and competitive generalization capabilities, compared to state-of-the-art MIP-based approaches.
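To illustrate how a logistic loss can be made compatible with a MIP model, the following minimal sketch builds a piecewise linear approximation of $\log(1 + e^{-z})$ as the maximum of tangent lines. Because the logistic loss is convex, every tangent lies below it, so each one could be imposed as a linear constraint of the form $L \ge a_k z + b_k$ on a loss variable $L$. The abstract does not specify which approximation scheme the paper adopts: the tangent-based construction, the breakpoint values, and all function names here are assumptions made for illustration only.

```python
import math


def logistic_loss(z):
    """Numerically stable evaluation of log(1 + exp(-z))."""
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def tangent_cuts(breakpoints):
    """Return (slope, intercept) pairs of the tangents at the given points.

    The derivative of log(1 + exp(-z)) is -1 / (1 + exp(z)).
    """
    cuts = []
    for z0 in breakpoints:
        slope = -1.0 / (1.0 + math.exp(z0))
        intercept = logistic_loss(z0) - slope * z0
        cuts.append((slope, intercept))
    return cuts


def piecewise_linear_loss(z, cuts):
    """Pointwise maximum of the tangents: a lower bound on the logistic loss."""
    return max(a * z + b for a, b in cuts)


# Hypothetical breakpoints, chosen only for this illustration.
BREAKPOINTS = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
CUTS = tangent_cuts(BREAKPOINTS)

if __name__ == "__main__":
    for z in (-3.0, -0.5, 0.0, 0.5, 3.0):
        exact = logistic_loss(z)
        approx = piecewise_linear_loss(z, CUTS)
        print(f"z={z:+.1f}  exact={exact:.4f}  approx={approx:.4f}")
```

In this sketch the approximation is exact at the breakpoints and underestimates the loss elsewhere; adding breakpoints tightens it at the cost of more linear constraints in the resulting MIP.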