The Distributional Random Forest (DRF) is a recently introduced Random Forest algorithm for estimating multivariate conditional distributions. Because its estimation procedure is general, it can be used to estimate a wide range of targets, such as conditional average treatment effects, conditional quantiles, and conditional correlations. So far, however, only the consistency and convergence rate of the DRF prediction have been established. We characterize the asymptotic distribution of DRF and develop a bootstrap approximation of it. This allows us to derive inferential tools for quantifying standard errors and constructing confidence regions with asymptotic coverage guarantees. In simulation studies, we empirically validate the developed theory for inference on low-dimensional targets and for testing distributional differences between two populations.
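To make the idea of bootstrap standard errors and confidence regions concrete, the following is a minimal sketch under strong simplifying assumptions, not the procedure developed in this work. It uses a Gaussian-kernel weighted mean as a stand-in for forest-based weights, a plain nonparametric (resample-with-replacement) bootstrap, and a normal-approximation interval. The names `weighted_mean` and `bootstrap_ci`, the bandwidth, the number of replicates, and the simulated data are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def weighted_mean(xs, ys, x0, bandwidth):
    """Gaussian-kernel weighted mean of ys at the query point x0.

    Assumption: kernel weights stand in for the weights a forest would
    assign to training points; this is not the DRF weighting scheme.
    """
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


def bootstrap_ci(xs, ys, x0, bandwidth=0.3, n_boot=200, z=1.96, seed=0):
    """Return (estimate, standard error, (lower, upper)) at x0.

    Assumption: a generic nonparametric bootstrap, resampling (x, y) pairs
    with replacement, combined with a normal-approximation interval.
    """
    rng = random.Random(seed)
    n = len(xs)
    estimate = weighted_mean(xs, ys, x0, bandwidth)

    replicates = []
    for _ in range(n_boot):
        # Draw n indices with replacement and re-estimate on the resample.
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [xs[i] for i in idx]
        yb = [ys[i] for i in idx]
        replicates.append(weighted_mean(xb, yb, x0, bandwidth))

    # The spread of the bootstrap replicates approximates the standard error.
    se = statistics.stdev(replicates)
    return estimate, se, (estimate - z * se, estimate + z * se)


# Simulated data (illustrative only): y = 2x + Gaussian noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(300)]
ys = [2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]

estimate, se, (lower, upper) = bootstrap_ci(xs, ys, x0=0.0)
print(f"estimate={estimate:.3f}, se={se:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

In this sketch the interval is only as trustworthy as the normal approximation behind it. The asymptotic characterization described in the abstract is what would justify such an approximation for DRF itself.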