When does Metropolized Hamiltonian Monte Carlo provably outperform Metropolis-adjusted Langevin algorithm?

We analyze the mixing time of Metropolized Hamiltonian Monte Carlo (HMC) with the leapfrog integrator for sampling from a distribution on $\mathbb{R}^d$ whose log-density is smooth, has a Hessian that is Lipschitz in Frobenius norm, and satisfies isoperimetry. We bound the gradient complexity to reach $\epsilon$ error in total variation distance from a warm start by $\tilde O(d^{1/4}\text{polylog}(1/\epsilon))$ and demonstrate the benefit of choosing the number of leapfrog steps to be larger than 1. To improve on the previous analysis of the Metropolis-adjusted Langevin algorithm (MALA) in Wu et al. (2022), which has $\tilde{O}(d^{1/2}\text{polylog}(1/\epsilon))$ dimension dependency, we reveal a key feature in our proof: the joint distribution of the location and velocity variables of the discretization of the continuous HMC dynamics stays approximately invariant. This key feature, shown by induction over the number of leapfrog steps, enables us to estimate the moments of the various quantities that appear in the acceptance rate control of Metropolized HMC. Moreover, to address another bottleneck in the literature, the control of the overlap between HMC proposal distributions, we provide a new approach to upper bound the Kullback-Leibler divergence between push-forwards of the Gaussian distribution through HMC dynamics initialized at two different points. Notably, our analysis does not require log-concavity or independence of the marginals, and relies only on an isoperimetric inequality. To illustrate the applicability of our result, we discuss several examples of natural functions that fall into our framework.
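For readers unfamiliar with the algorithm under analysis, the following is a minimal sketch of Metropolized HMC with a leapfrog integrator, written under stated assumptions rather than taken from the paper. The function names (`leapfrog`, `metropolized_hmc`), the standard Gaussian target, and the step size and number of leapfrog steps are all hypothetical choices made for illustration. In particular, they are not the parameter choices under which the $\tilde O(d^{1/4})$ bound is derived.

```python
import numpy as np

def leapfrog(x, v, grad_f, step_size, num_steps):
    """Run num_steps leapfrog steps for H(x, v) = f(x) + |v|^2 / 2."""
    x, v = x.copy(), v.copy()
    v -= 0.5 * step_size * grad_f(x)          # initial half step in velocity
    for k in range(num_steps):
        x += step_size * v                    # full step in position
        # full velocity step between position updates, half step at the end
        scale = 1.0 if k < num_steps - 1 else 0.5
        v -= scale * step_size * grad_f(x)
    return x, v

def metropolized_hmc(f, grad_f, x0, step_size, num_steps, num_iters, rng):
    """Sample from a density proportional to exp(-f) with an accept/reject step."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((num_iters, x.size))
    accepted = 0
    for i in range(num_iters):
        v = rng.standard_normal(x.size)       # refresh velocity from N(0, I)
        x_new, v_new = leapfrog(x, v, grad_f, step_size, num_steps)
        h_old = f(x) + 0.5 * v @ v
        h_new = f(x_new) + 0.5 * v_new @ v_new
        # Metropolis filter: accept with probability min(1, exp(h_old - h_new))
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
            accepted += 1
        samples[i] = x
    return samples, accepted / num_iters

# Illustrative target (an assumption): standard Gaussian, f(x) = |x|^2 / 2.
f = lambda x: 0.5 * x @ x
grad_f = lambda x: x

rng = np.random.default_rng(0)
samples, acc_rate = metropolized_hmc(
    f, grad_f, x0=np.zeros(4), step_size=0.3, num_steps=5,
    num_iters=2000, rng=rng,
)
print("acceptance rate:", acc_rate)
print("per-coordinate sample variance:", samples.var(axis=0))
```

Setting `num_steps=1` yields a proposal of the same form as MALA, which is why the number of leapfrog steps is the natural parameter for comparing the two algorithms. Each iteration of this sketch costs `num_steps + 1` gradient evaluations; caching the gradient at the current point would reduce that to `num_steps`.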
