Approximate Bayesian Computation (ABC) is a popular inference method when likelihoods are intractable or unavailable. Practical bottlenecks in ABC applications include selecting summary statistics that capture the data without losing too much information or introducing additional uncertainty, and choosing distance functions and tolerance thresholds that balance accuracy against computational cost. Recent studies have shown that ABC methods based on random forest (RF) methodology perform well while circumventing many of ABC's drawbacks. However, RF construction is computationally expensive when many trees and model simulations are required, and the posterior can be highly uncertain if the prior distribution is uninformative. Here we further adapt random forests to the ABC setting in two ways: the first exploits distributional random forests to infer the joint posterior distribution of the parameters of interest directly, while the second describes a sequential Monte Carlo approach that iteratively updates the prior distribution to focus on the most likely regions of parameter space. We show that the new methods accurately infer posterior distributions for a wide range of deterministic and stochastic models from different scientific areas.
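To make the bottlenecks named above concrete, the following is a minimal toy sketch of basic rejection ABC (the baseline the RF variants build on), not the paper's method: it shows where a summary statistic, a distance function, and a tolerance threshold enter. The Gaussian model, the choice of the sample mean as summary, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n=50):
    # Toy stochastic model: Gaussian observations with unknown mean theta.
    return rng.normal(theta, 1.0, size=n)

def summary(x):
    # Summary statistic: the sample mean (sufficient for this toy model,
    # but lossy in general -- the first bottleneck mentioned above).
    return x.mean()

def abc_rejection(observed, prior_sample, n_draws=5000, tol=0.1):
    # Keep prior draws whose simulated summaries fall within tol of the
    # observed summary; distance and tol trade accuracy for cost.
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample()
        s_sim = summary(simulate(theta))
        if abs(s_sim - s_obs) < tol:  # distance function + tolerance threshold
            accepted.append(theta)
    return np.array(accepted)

# Synthetic "observed" data with true mean 2.0, and a wide uniform prior.
observed = rng.normal(2.0, 1.0, size=50)
posterior = abc_rejection(observed, lambda: rng.uniform(-5, 5))
```

The accepted draws approximate the posterior of `theta`; the uninformative prior means most simulations are wasted, which is the inefficiency the sequential Monte Carlo variant addresses by iteratively concentrating the prior.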