With the recent surge of interest in probabilistic models, the efficiency of the underlying sampler has become a crucial consideration. Hamiltonian Monte Carlo (HMC) is a popular option for models of this kind. The performance of HMC, however, depends strongly on the choice of parameters associated with the numerical integration of the Hamiltonian equations, and to date this choice remains largely heuristic or comes at additional computational cost. We propose a novel, computationally inexpensive and flexible approach, which we call Adaptive Tuning (ATune). By combining a theoretical analysis of the multivariate Gaussian model with simulation data generated during the burn-in stage of HMC, it identifies a system-specific splitting integrator together with a set of reliable HMC hyperparameters, including their credible randomization intervals, ready for use in a production simulation. The method automatically eliminates values of the simulation parameters that could cause undesired extreme scenarios, such as resonance artefacts, low accuracy or poor sampling. The new approach is implemented in the in-house software package HaiCS, introduces no computational overhead in a production simulation, and can be easily incorporated into any package for Bayesian inference with HMC. Tests on popular statistical models show that adaptively tuned HMC and generalized Hamiltonian Monte Carlo (GHMC) outperform conventional, heuristically tuned HMC coupled with well-established integrators in terms of stability, performance and accuracy. We also argue that the generalized formulation of HMC, i.e. GHMC, is preferable for achieving high sampling performance. The efficiency of the new methodology is assessed against state-of-the-art samplers, e.g. NUTS, in real-world applications, such as endocrine therapy resistance in cancer, modeling of cell-cell adhesion dynamics, and an influenza A epidemic outbreak.
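To make the ingredients named in the abstract concrete, the following is a minimal sketch of plain HMC driven by a one-parameter family of two-stage splitting integrators, with the step size drawn at random from an interval at each iteration. It is not the ATune procedure and not the HaiCS implementation: the function names (`two_stage_step`, `hmc_sample`), the unit mass matrix, the default `b = 0.25` (which reduces to two velocity Verlet steps of size `h/2`), and the fixed `h_interval` are all assumptions chosen for illustration. In the approach described above, the integrator parameter and the randomization interval would instead be derived from the Gaussian analysis and burn-in data.

```python
import numpy as np


def two_stage_step(q, p, grad_U, h, b):
    """One step of a two-stage splitting integrator with parameter b.

    Kick-drift-kick-drift-kick composition; b = 0.25 is equivalent to
    two velocity Verlet steps of size h/2 (unit mass matrix assumed).
    """
    p = p - b * h * grad_U(q)                # first momentum kick
    q = q + 0.5 * h * p                      # first position drift
    p = p - (1.0 - 2.0 * b) * h * grad_U(q)  # middle momentum kick
    q = q + 0.5 * h * p                      # second position drift
    p = p - b * h * grad_U(q)                # last momentum kick
    return q, p


def hmc_sample(U, grad_U, q0, n_samples, n_steps, h_interval, b=0.25, seed=0):
    """Plain HMC with a step size randomized uniformly in h_interval.

    U is the potential (negative log density up to a constant).
    Returns the chain and the observed acceptance rate.
    """
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = np.empty((n_samples, q.size))
    accepted = 0
    for i in range(n_samples):
        h = rng.uniform(*h_interval)         # randomized step size
        p0 = rng.standard_normal(q.size)     # momentum refreshment
        q_new, p_new = q, p0
        for _ in range(n_steps):
            q_new, p_new = two_stage_step(q_new, p_new, grad_U, h, b)
        # Energy error of the proposal, used in the Metropolis test
        dH = (U(q_new) + 0.5 * p_new @ p_new) - (U(q) + 0.5 * p0 @ p0)
        if np.log(rng.uniform()) < -dH:
            q = q_new
            accepted += 1
        samples[i] = q
    return samples, accepted / n_samples


if __name__ == "__main__":
    # Standard 2D Gaussian target: U(q) = |q|^2 / 2
    U = lambda q: 0.5 * q @ q
    grad_U = lambda q: q
    chain, rate = hmc_sample(U, grad_U, np.zeros(2), 2000, 5, (0.2, 0.4))
    print("acceptance rate:", rate)
    print("sample mean:", chain.mean(axis=0))
```

Varying `b` changes the integrator within the family without changing the number of gradient evaluations per step, which is what makes a system-specific choice of integrator cheap to deploy once it has been selected.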