Bayesian additive regression trees (BART) is a non-parametric method for approximating functions. It is a black-box method based on the sum of many trees, where priors are used to regularize inference, mainly by restricting the learning capacity of each tree so that no single tree can explain the data on its own; only the sum of trees can. We discuss BART in the context of probabilistic programming languages (PPLs), i.e., we present BART as a primitive that can be used as a component of a probabilistic model rather than as a standalone model. Specifically, we introduce the Python library PyMC-BART, which extends PyMC, a library for probabilistic programming. We showcase a few examples of models that can be built using PyMC-BART, discuss recommendations for the selection of hyperparameters, and close with the limitations of our implementation and future directions for improvement.
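The sum-of-trees idea can be illustrated with a minimal sketch in plain Python. It is not BART and does not use PyMC-BART: instead of priors and posterior sampling, it fits depth-one trees (stumps) greedily to residuals and shrinks each stump's leaf values by a fixed factor, which is a stand-in for the regularization that keeps any single tree weak. The helper names (`fit_stump`, `predict_sum`, `mse`), the shrinkage factor of 0.1, the 100 trees, and the sine target are all assumptions chosen for illustration.

```python
import math


def fit_stump(x, r, shrink):
    """Fit a one-split tree to residuals r; leaf values are shrunk toward zero."""
    best = None
    # Candidate thresholds: every unique x except the largest,
    # so that both sides of the split are non-empty.
    for t in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((ri - left_mean) ** 2 for ri in left)
        sse += sum((ri - right_mean) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, shrink * left_mean, shrink * right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, xi):
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def predict_sum(stumps, xi):
    """Prediction of the ensemble: the sum of all individual tree outputs."""
    return sum(predict_stump(s, xi) for s in stumps)


def mse(stumps, x, y):
    return sum((yi - predict_sum(stumps, xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


# Toy target (an assumption for illustration): one period of a sine wave.
x = [i / 49 for i in range(50)]
y = [math.sin(2 * math.pi * xi) for xi in x]

stumps = []
residual = list(y)
for _ in range(100):
    stump = fit_stump(x, residual, shrink=0.1)
    stumps.append(stump)
    # Each new tree only sees what the previous trees left unexplained.
    residual = [ri - predict_stump(stump, xi) for xi, ri in zip(x, residual)]

mse_single = mse(stumps[:1], x, y)  # one weak tree alone
mse_all = mse(stumps, x, y)         # the sum of all trees

print("MSE of a single tree:", mse_single)
print("MSE of the sum of trees:", mse_all)
```

Because every stump is shrunk, a single tree fits the target poorly, while the sum of all trees fits it more closely. In BART the same effect comes from priors over tree structure and leaf values, with inference performed by sampling rather than by the greedy fitting used here.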