Global optimization of decision trees has been shown to be promising in terms of accuracy, size, and consequently human comprehensibility. However, many of the methods used rely on general-purpose solvers, for which scalability remains an issue. Dynamic programming methods have been shown to scale much better because they exploit the tree structure by solving subtrees as independent subproblems. However, this only works when an objective can be optimized separately for subtrees. We explore this relationship in detail, establish necessary and sufficient conditions for such separability, and generalize previous dynamic programming approaches into a framework that can optimize any combination of separable objectives and constraints. Experiments on four application domains show the general applicability of this framework, which also scales better than general-purpose solvers by a large margin.
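To illustrate what solving subtrees as independent subproblems can look like, the following is a minimal sketch and not the framework described above. It assumes binary features, binary labels in {0, 1}, a depth limit, and misclassification count as the objective, which is additive over the left and right subtrees and therefore separable. The names `optimal_tree_cost`, `leaf_cost`, and `solve` are hypothetical and chosen only for this example.

```python
from functools import lru_cache


def optimal_tree_cost(X, y, max_depth):
    """Return the minimum number of misclassifications achievable by any
    decision tree of depth <= max_depth (illustrative sketch only).

    Assumptions: X is a list of tuples of 0/1 features, y is a list of
    0/1 labels, and the objective is the misclassification count.
    """
    n_features = len(X[0])

    def leaf_cost(idx):
        # A leaf predicts the majority label; the minority count is its error.
        ones = sum(y[i] for i in idx)
        return min(ones, len(idx) - ones)

    @lru_cache(maxsize=None)
    def solve(idx, depth):
        # idx is a frozenset of instance indices reaching this subtree.
        best = leaf_cost(idx)
        if depth == 0 or best == 0:
            return best
        for f in range(n_features):
            left = frozenset(i for i in idx if X[i][f] == 0)
            right = idx - left
            if not left or not right:
                continue  # this feature does not split the data
            # Separability: the two subtrees are optimized independently
            # and their optimal costs are simply added.
            best = min(best, solve(left, depth - 1) + solve(right, depth - 1))
        return best

    return solve(frozenset(range(len(y))), max_depth)


# Usage on a small XOR-style dataset, which needs depth 2 to fit exactly.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
print(optimal_tree_cost(X, y, 1), optimal_tree_cost(X, y, 2))
```

Caching on the pair (instance subset, remaining depth) lets identical subproblems reached through different split orders be solved once. An objective that cannot be written as a combination of independently optimized subtree values would not fit this recursion as written.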