We develop a semiparametric framework for inference on the mean response in missing-data settings using a corrected posterior distribution. Our approach is tailored to Bayesian Additive Regression Trees (BART), a powerful predictive method whose nonsmoothness complicates asymptotic theory with multi-dimensional covariates. For BART combined with Bayesian bootstrap weights, we establish a new Bernstein-von Mises theorem and show that the limit distribution generally contains a bias term. To address this, we introduce RoBART, a posterior bias correction that robustifies BART for valid inference on the mean response. Monte Carlo studies support our theory, demonstrating reduced bias and improved coverage relative to existing BART-based procedures.