Standard Markov chain Monte Carlo (MCMC) admits three fundamental control parameters: the number of chains, the length of the warmup phase, and the length of the sampling phase. These control parameters play a large role in determining the amount of computation we deploy. In practice, we need to walk a line between achieving sufficient precision and not wasting precious computational resources and time. We review general strategies to check the length of the warmup and sampling phases, and examine the three control parameters of MCMC in the contexts of CPU- and GPU-based hardware. Our discussion centers on three tasks: (1) inference about a latent variable, (2) computation of expectation values and quantiles, and (3) diagnostics to check the reliability of the estimators. This chapter begins with general recommendations on the control parameters of MCMC, which have been battle-tested over the years and often motivate defaults in Bayesian statistical software. Usually we do not know ahead of time how a sampler will interact with a target distribution, and so the choice of MCMC algorithm and its control parameters tends to be based on experience and is re-evaluated after simulations have been obtained and analyzed. The second part of this chapter provides a theoretical motivation for our recommended approach, with pointers to some concerns and open problems. We also examine recent developments on the algorithmic and hardware fronts, which motivate new computational approaches to MCMC.