Global optimization of decision trees has been shown to be promising in terms of accuracy, size, and consequently human comprehensibility. However, many of the existing methods rely on general-purpose solvers, for which scalability remains an issue. Dynamic programming methods have been shown to scale much better because they exploit the tree structure by solving subtrees as independent subproblems. However, this only works when the objective can be optimized separately for each subtree. We explore this relationship in detail, establish necessary and sufficient conditions for such separability, and generalize previous dynamic programming approaches into a framework that can optimize any combination of separable objectives and constraints. Experiments on five application domains show the general applicability of this framework, which also scales better than general-purpose solvers by a large margin.
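To illustrate what solving subtrees as independent subproblems means, the following is a minimal sketch and not the framework described above. It assumes binary features, a depth limit, and misclassification count as the objective. Because that count is additive over leaves, the best left and right subtrees can be found separately and their costs summed. The function name `optimal_tree_cost` and the toy data are hypothetical.

```python
from functools import lru_cache


def optimal_tree_cost(X, y, max_depth):
    """Return the minimum number of misclassifications achievable by any
    decision tree of depth <= max_depth on binary features (illustrative only)."""
    n_features = len(X[0])

    @lru_cache(maxsize=None)
    def solve(rows, depth):
        # rows is a tuple of instance indices reaching this subtree.
        labels = [y[i] for i in rows]
        # Cost of making this node a leaf that predicts the majority label.
        leaf_cost = len(labels) - max(labels.count(c) for c in set(labels))
        if depth == 0 or leaf_cost == 0:
            return leaf_cost
        best = leaf_cost
        for f in range(n_features):
            left = tuple(i for i in rows if X[i][f] == 0)
            right = tuple(i for i in rows if X[i][f] == 1)
            if not left or not right:
                continue  # this split does not separate the data
            # Separability: each subtree is optimized on its own,
            # and the two optimal costs are simply added.
            best = min(best, solve(left, depth - 1) + solve(right, depth - 1))
        return best

    return solve(tuple(range(len(X))), max_depth)


# Toy XOR data (hypothetical): no single split helps, two levels solve it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
cost_depth_1 = optimal_tree_cost(X, y, 1)
cost_depth_2 = optimal_tree_cost(X, y, 2)
```

The memoization on `(rows, depth)` is what lets identical subproblems be reused. An objective that couples the two subtrees, so that the best left subtree depends on the choice made on the right, would break the simple sum in the inner loop.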