We present a novel prior for tree topology within Bayesian Additive Regression Trees (BART) models. The approach quantifies the hypothetical loss in information and the loss due to complexity associated with choosing the wrong tree structure. The resulting prior distribution favors sparse trees, a critical feature given the tendency of BART models to overfit. Prior knowledge enters the distribution through two parameters, which govern the depth of the tree and the balance between its left and right branches. We also propose a default calibration for these parameters, which yields an objective version of the prior. We demonstrate the method's efficacy on both simulated and real datasets.