Decision tree learning is increasingly being used for pointwise inference. Important applications include causal heterogeneous treatment effects and dynamic policy decisions, as well as conditional quantile regression and design of experiments, where tree estimation and inference are conducted at specific values of the covariates. In this paper, we call into question the use of decision trees (trained by adaptive recursive partitioning) for such purposes by demonstrating that they can fail to achieve polynomial rates of convergence in uniform norm with non-vanishing probability, even with pruning. Instead, the convergence may be arbitrarily slow or, in some important special cases, such as honest regression trees, fail completely. We show that random forests can remedy the situation, turning poorly performing trees into nearly optimal procedures, at the cost of losing interpretability and introducing two additional tuning parameters. The two hallmarks of random forests, subsampling and the random feature selection mechanism, are each seen to contribute distinctly to achieving nearly optimal performance for the model class considered.