With the recent growth of interest in probabilistic models, the efficiency of the underlying sampler has become a crucial consideration. The Hamiltonian Monte Carlo (HMC) sampler is a popular option for models of this kind. The performance of HMC, however, depends strongly on the choice of parameters associated with the numerical integration of the Hamiltonian equations, and to date this choice remains mainly heuristic or incurs additional computational cost. We propose a novel, computationally inexpensive and flexible approach, which we call Adaptive Tuning (ATune). By analyzing the data generated during the burn-in stage of an HMC simulation, it identifies a system-specific splitting integrator together with a set of reliable HMC hyperparameters, including their credible randomization intervals, ready for use in a production simulation. The method automatically eliminates those values of the simulation parameters that could lead to undesired extreme scenarios, such as resonance artifacts, low accuracy or poor sampling. The new approach is implemented in the in-house software package \textsf{HaiCS}, introduces no computational overhead in a production simulation, and can be easily incorporated into any package for Bayesian inference with HMC. Tests on popular statistical models using the original HMC and generalized Hamiltonian Monte Carlo (GHMC) show that the adaptively tuned methods are superior in stability, performance and accuracy to conventional HMC tuned heuristically and coupled with well-established integrators. We also argue that the generalized formulation of HMC, i.e. GHMC, is preferable for achieving high sampling performance. The efficiency of the new methodology is assessed against state-of-the-art samplers, e.g. the No-U-Turn Sampler (NUTS), in real-world applications: endocrine therapy resistance in cancer, modeling of cell-cell adhesion dynamics, and an influenza epidemic outbreak.
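To make concrete which quantities are being tuned, the following is a minimal sketch of plain HMC for a one-dimensional target, assuming a unit mass and the standard leapfrog (velocity Verlet) integrator. It is not the ATune procedure and not code from \textsf{HaiCS}; the names `leapfrog`, `hmc`, `step_interval` and `n_steps` are illustrative assumptions. The step size is drawn uniformly from an interval at every iteration, which is the kind of randomization interval the abstract refers to.

```python
import math
import random


def leapfrog(q, p, grad_U, step, n_steps):
    """Integrate Hamiltonian dynamics with leapfrog (unit mass assumed)."""
    p = p - 0.5 * step * grad_U(q)        # initial half step in momentum
    for i in range(n_steps):
        q = q + step * p                  # full step in position
        if i < n_steps - 1:
            p = p - step * grad_U(q)      # full step in momentum
    p = p - 0.5 * step * grad_U(q)        # final half step in momentum
    return q, p


def hmc(U, grad_U, q0, n_samples, step_interval, n_steps, rng):
    """Plain HMC with the step size randomized within step_interval."""
    q = q0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        step = rng.uniform(*step_interval)   # randomized step size
        p = rng.gauss(0.0, 1.0)              # fresh momentum draw
        h_old = U(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, grad_U, step, n_steps)
        h_new = U(q_new) + 0.5 * p_new * p_new
        # Metropolis test on the energy error of the numerical trajectory
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
            accepted += 1
        samples.append(q)
    return samples, accepted / n_samples


def U(q):
    """Potential of a standard normal target (hypothetical test case)."""
    return 0.5 * q * q


def grad_U(q):
    return q


rng = random.Random(0)
samples, acc_rate = hmc(U, grad_U, 0.0, 5000, (0.2, 0.4), 5, rng)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"acceptance rate {acc_rate:.2f}, mean {mean:.2f}, variance {var:.2f}")
```

In this sketch the step-size interval and the number of integration steps are fixed by hand. For this target the leapfrog scheme is stable only for step sizes below 2, and trajectory lengths near multiples of 2π return the chain close to its starting point, which illustrates why poorly chosen values can degrade sampling.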