The decision tree is a flexible machine learning model that has proven successful in numerous applications. It is usually fitted in a greedy, recursive manner using the CART algorithm. In this paper, we investigate the convergence rate of CART in the regression setting. First, we establish an upper bound on the prediction error of CART under a sufficient impurity decrease (SID) condition \cite{chi2022asymptotic}; our result improves upon the known result of \cite{chi2022asymptotic} under a similar assumption. Furthermore, we provide examples demonstrating that the error bound cannot be improved by more than a constant or a logarithmic factor. Second, we introduce a set of easily verifiable sufficient conditions for the SID condition. Specifically, we show that the SID condition is satisfied for an additive model, provided that the component functions satisfy a ``locally reverse Poincar{\'e} inequality''. We discuss several well-known function classes in non-parametric estimation to illustrate the practical utility of this concept.