Finding an optimal decision tree for a supervised learning task is a combinatorial problem that is challenging to solve at scale. Recent work proposed framing the problem as a Markov Decision Problem (MDP) and using deep reinforcement learning to address scaling. Unfortunately, these methods are not competitive with the current branch-and-bound state of the art. We propose instead to scale the solution of such MDPs using an information-theoretic test-generating function that heuristically, and dynamically for every state, limits the set of admissible test actions to a few good candidates. As a solver, we show empirically that our algorithm is at the very least competitive with branch-and-bound alternatives. As a machine learning tool, a key advantage of our approach is that it solves for multiple complexity-performance trade-offs at virtually no additional cost. Given such a set of solutions, a user can then select the tree that generalizes best and whose level of interpretability best suits their needs, something no current branch-and-bound method allows.
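To make the idea of a per-state test-generating function concrete, the following is a minimal sketch, not the method of the paper. It assumes that a state is the subset of training examples reaching a node, that tests are axis-aligned thresholds of the form "feature <= threshold", and that candidates are ranked by information gain. The names entropy, information_gain and generate_tests, and the parameter k, are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature, threshold):
    """Entropy reduction obtained by the test rows[i][feature] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    if not left or not right:
        return 0.0  # a test that does not split the data carries no information
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children


def generate_tests(rows, labels, k=3):
    """Return the k (feature, threshold) tests with the highest information gain.

    Assumption: this stands in for the test-generating function; it is called
    once per state and restricts the admissible test actions to k candidates.
    """
    candidates = []
    for feature in range(len(rows[0])):
        values = sorted({x[feature] for x in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            gain = information_gain(rows, labels, feature, threshold)
            candidates.append((gain, feature, threshold))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(feature, threshold) for _, feature, threshold in candidates[:k]]


# Toy state: four examples, two features, two classes.
rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0)]
labels = [0, 0, 1, 1]
best_tests = generate_tests(rows, labels, k=2)
```

In this sketch, a solver would only branch on the tests returned by generate_tests for the current state instead of on every possible split, which is one plausible way to keep the number of admissible actions small.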