Decision Tree Learning is a fundamental problem in Interpretable Machine Learning, yet it poses a formidable optimization challenge. Despite numerous efforts dating back to the early 1990s, practical algorithms have emerged only recently, primarily leveraging Dynamic Programming (DP) and Branch & Bound (B&B) techniques. These breakthroughs led to two distinct approaches. Algorithms like DL8.5 and MurTree operate on the space of nodes (or branches); they are very fast but do not penalise complex Decision Trees, i.e. they do not solve for sparsity. On the other hand, algorithms like OSDT and GOSDT operate on the space of Decision Trees; they solve for sparsity, but at the expense of speed. In this work, we introduce Branches, a novel algorithm that integrates the strengths of both paradigms. Leveraging DP and B&B, Branches achieves exceptional speed while also solving for sparsity. Central to its efficiency is a novel analytical bound that enables substantial pruning of the search space. Furthermore, Branches does not require binary features. Theoretical analysis demonstrates that Branches has a lower complexity bound than state-of-the-art methods, a claim validated through extensive empirical evaluation. Our results show that Branches outperforms the state of the art in terms of speed and number of iterations while consistently yielding optimal Decision Trees.
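To make "solving for sparsity" concrete, the following is a minimal sketch of a sparsity-penalised objective, assuming the common form used by OSDT-style methods: misclassification rate plus a penalty `lam` per leaf. The names `TreeSummary`, `penalised_objective`, and `best_tree`, as well as the candidate trees and their error counts, are hypothetical and chosen only for illustration; this is not the Branches algorithm itself, which searches the space rather than enumerating candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeSummary:
    """Hypothetical summary of a candidate tree: training errors and leaf count."""
    name: str
    errors: int
    leaves: int


def penalised_objective(tree: TreeSummary, n_samples: int, lam: float) -> float:
    # Assumed objective: misclassification rate + lam * number of leaves.
    # With lam = 0 this reduces to plain training error (no sparsity penalty).
    return tree.errors / n_samples + lam * tree.leaves


def best_tree(candidates, n_samples: int, lam: float) -> TreeSummary:
    # Pick the candidate that minimises the penalised objective.
    return min(candidates, key=lambda t: penalised_objective(t, n_samples, lam))


# Illustrative candidates on a hypothetical dataset of 100 samples.
candidates = [
    TreeSummary("stump", errors=20, leaves=2),
    TreeSummary("medium", errors=12, leaves=4),
    TreeSummary("deep", errors=10, leaves=16),
]

for lam in (0.0, 0.01, 0.05):
    print(lam, best_tree(candidates, n_samples=100, lam=lam).name)
```

Without a penalty the most complex tree wins on training error alone; as `lam` grows, the preferred tree becomes smaller. This is the trade-off that sparsity-aware methods optimise and that purely accuracy-driven search under a depth limit does not capture.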