In the general framework of Bayesian inference, the target distribution can only be evaluated up to a constant of proportionality. Classical consistent Bayesian methods such as sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) have unbounded time-complexity requirements. We develop a fully parallel sequential Monte Carlo (pSMC) method which provably delivers parallel strong scaling, i.e. the time complexity (and per-node memory) remains bounded provided the number of asynchronous processes is allowed to grow. More precisely, pSMC has a theoretical convergence rate of MSE $= O(1/(NR))$, where $N$ denotes the number of communicating samples on each processor and $R$ denotes the number of processors. In particular, for suitably large problem-dependent $N$, as $R \rightarrow \infty$ the method converges to infinitesimal accuracy MSE $= O(\varepsilon^2)$ with a fixed finite time complexity Cost $= O(1)$ and with no efficiency leakage, i.e. total computational complexity Cost $= O(\varepsilon^{-2})$. We compare the pSMC and MCMC methods on a number of Bayesian inference problems.
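The $O(1/(NR))$ scaling can be illustrated with a toy experiment. The sketch below is not the pSMC algorithm: it assumes a much simpler stand-in, self-normalized importance sampling on a one-dimensional Gaussian target known only up to its normalizing constant. Each of $R$ simulated "processors" draws $N$ samples independently, and the per-processor estimates are combined with weights proportional to each processor's normalizing-constant estimate. The target, the proposal, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def unnormalized_target(x):
    # Unnormalized N(1, 1) density; the constant of proportionality is
    # deliberately omitted (illustrative assumption, not from the paper).
    return math.exp(-0.5 * (x - 1.0) ** 2)


def proposal_pdf(x, scale=2.0):
    # Density of the N(0, scale^2) proposal (also an assumption).
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def single_processor(n, rng):
    """Simulate one processor: return (sum of weights, sum of weight * x)."""
    sum_w = 0.0
    sum_wx = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)
        w = unnormalized_target(x) / proposal_pdf(x)
        sum_w += w
        sum_wx += w * x
    return sum_w, sum_wx


def parallel_estimate(n, r, rng):
    """Combine r independent processors into one estimate of the target mean.

    Dividing the pooled weighted sum by the pooled weight is the same as
    averaging the per-processor estimates with weights proportional to
    each processor's normalizing-constant estimate.
    """
    parts = [single_processor(n, rng) for _ in range(r)]
    return sum(p[1] for p in parts) / sum(p[0] for p in parts)


def empirical_mse(n, r, trials, seed=0):
    """Monte Carlo estimate of the MSE against the true mean, which is 1."""
    rng = random.Random(seed)
    errors = [(parallel_estimate(n, r, rng) - 1.0) ** 2 for _ in range(trials)]
    return sum(errors) / trials


if __name__ == "__main__":
    for r in (1, 4, 16):
        print(r, empirical_mse(n=50, r=r, trials=300))
```

With $N$ held fixed, the printed MSE should shrink roughly in proportion to $1/R$. The work per simulated processor stays at $N$ draws while the total work grows as $NR$, which mirrors the distinction in the abstract between bounded time complexity and $O(\varepsilon^{-2})$ total cost.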