We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under mild regularity conditions on the regression function and data distribution, we establish weak and almost sure convergence of honest trees and honest forest averages to the true regression function, and moreover we obtain uniform convergence over compact covariate domains. The framework naturally accommodates ensemble variants based on subsampling and also a two-stage bootstrap sampling scheme. Our treatment synthesizes and simplifies existing analyses, in particular recovering several results as special cases. The elementary nature of the arguments clarifies the close relationship between data-adaptive partitioning and kernel-type methods, providing an accessible approach to understanding the asymptotic behavior of tree-based methods.