We derive Gaussian approximation bounds for random forest predictions based on a set of training points given by a Poisson process, under fairly mild regularity assumptions on the data-generating process. Our approach rests on the key observation that random forest predictions satisfy a geometric property called region-based stabilization. In developing our results for the random forest, we also establish a probabilistic result, which may be of independent interest, giving multivariate Gaussian approximation bounds for general functionals of a Poisson process that are region-based stabilizing. This general result relies on the Malliavin-Stein method and is potentially applicable to various related statistical problems.