Global optimization of decision trees has been shown to be promising in terms of accuracy, size, and, consequently, human comprehensibility. However, many existing methods rely on general-purpose solvers, for which scalability remains an issue. Dynamic programming methods have been shown to scale much better because they exploit the tree structure by solving subtrees as independent subproblems. However, this only works when the objective can be optimized separately for each subtree. We explore this relationship in detail, establish necessary and sufficient conditions for such separability, and generalize previous dynamic programming approaches into a framework that can optimize any combination of separable objectives and constraints. Experiments on five application domains show the general applicability of this framework, which also scales better than general-purpose solvers by a large margin.
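To illustrate what solving subtrees as independent subproblems means, the following is a minimal sketch and not the framework described above. It assumes binary features, binary labels, and misclassification count as the objective, which is separable because the cost of a tree is the sum of the costs of its left and right subtrees. The function name `optimal_tree_cost` and the toy data are hypothetical and chosen only for illustration.

```python
from functools import lru_cache


def optimal_tree_cost(X, y, max_depth):
    """Return the minimum number of misclassifications achievable by any
    decision tree of depth <= max_depth (binary features, labels in {0, 1})."""
    n_features = len(X[0])

    @lru_cache(maxsize=None)
    def solve(rows, depth):
        # Cost of stopping here: a leaf that predicts the majority label.
        positives = sum(y[i] for i in rows)
        best = min(positives, len(rows) - positives)
        if depth == 0 or best == 0:
            return best
        for f in range(n_features):
            left = tuple(i for i in rows if X[i][f] == 0)
            right = tuple(i for i in rows if X[i][f] == 1)
            if not left or not right:
                continue  # this split does not separate any instances
            # Separability: the two subtrees are optimized independently
            # and their optimal costs are simply added.
            best = min(best, solve(left, depth - 1) + solve(right, depth - 1))
        return best

    return solve(tuple(range(len(X))), max_depth)


# Toy XOR data: no single split helps, but a depth-2 tree is perfect.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
print(optimal_tree_cost(X, y, 1))  # 2
print(optimal_tree_cost(X, y, 2))  # 0
```

Memoizing on the pair (instance subset, remaining depth) lets identical subproblems reached through different split orders be solved once. An objective that cannot be decomposed over the two children in this way would invalidate the `solve(left) + solve(right)` step, which is the limitation the abstract refers to.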