Studying how the dependence between random variables changes under external influences is a challenging problem in multivariate analysis. We address it by proposing a novel semi-parametric approach to conditional copula models based on Bayesian additive regression trees (BART). BART is becoming a popular tool in statistical modelling because of its simple ensemble-type formulation and its ability to provide inferential insights. Although BART can model complex functional relationships, it tends to overfit. In this article, we exploit a loss-based prior for the tree topology that is designed to reduce tree complexity. In addition, we propose a novel adaptive Reversible Jump Markov Chain Monte Carlo algorithm that is ergodic and requires very few assumptions, allowing us to handle complex and non-smooth likelihood functions with ease. We show that our method can efficiently recover the true tree structure and approximate a complex conditional copula parameter, and that our adaptive routine can explore the true likelihood region even under a sub-optimal proposal variance. Lastly, we provide case studies on the effect of gross domestic product on the dependence between the life expectancies and literacy rates of the male and female populations of different countries.
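To make the idea of a conditional copula parameter concrete, here is a minimal sketch in which the correlation of a Gaussian copula depends on a covariate through a piecewise-constant, tree-like function. It is not the method proposed here. The Gaussian copula family, the single one-split tree (rather than a sum of trees), the `tanh` link, and the names `step_tree`, `link`, `simulate` and `gaussian_copula_loglik` are all assumptions chosen for illustration.

```python
import numpy as np
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the CDF and its inverse


def step_tree(x, split=0.5, left=-0.5, right=1.0):
    """Hypothetical one-split tree: a piecewise-constant value on the unconstrained scale."""
    return np.where(x <= split, left, right)


def link(eta):
    """Assumed link: map unconstrained values to a correlation in (-1, 1)."""
    return np.tanh(eta)


def simulate(rho, rng):
    """Draw one pair (u, v) per entry of rho from a Gaussian copula with that correlation."""
    n = rho.shape[0]
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    u = np.array([_STD_NORMAL.cdf(z) for z in z1])
    v = np.array([_STD_NORMAL.cdf(z) for z in z2])
    return u, v


def gaussian_copula_loglik(u, v, rho):
    """Sum of log Gaussian-copula densities with a per-observation correlation rho."""
    z1 = np.array([_STD_NORMAL.inv_cdf(p) for p in u])
    z2 = np.array([_STD_NORMAL.inv_cdf(p) for p in v])
    one_minus = 1.0 - rho**2
    log_c = (-0.5 * np.log(one_minus)
             - (rho**2 * (z1**2 + z2**2) - 2.0 * rho * z1 * z2) / (2.0 * one_minus))
    return float(np.sum(log_c))


rng = np.random.default_rng(0)
x = rng.uniform(size=2000)            # covariate (the "external influence")
rho_true = link(step_tree(x))         # covariate-dependent copula parameter
u, v = simulate(rho_true, rng)

ll_true = gaussian_copula_loglik(u, v, rho_true)
ll_indep = gaussian_copula_loglik(u, v, np.zeros_like(x))  # independence copula
```

On simulated data of this kind, the log-likelihood under the covariate-dependent correlation should exceed the one under independence. That gap is the signal a tree-based model of the copula parameter would try to capture.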