Sampling from a target distribution is a fundamental problem. Traditional Markov chain Monte Carlo (MCMC) algorithms, such as the unadjusted Langevin algorithm (ULA) derived from overdamped Langevin dynamics, have been extensively studied. From an optimization perspective, the Kolmogorov forward equation of overdamped Langevin dynamics can be viewed as the gradient flow of the relative entropy in the space of probability densities equipped with the Wasserstein-2 metric. Considerable effort has also been devoted to momentum-based methods, such as underdamped Langevin dynamics, for faster convergence of sampling algorithms. Recent advances in optimization have demonstrated the effectiveness of primal-dual damping and Hessian-driven damping dynamics in achieving faster convergence for optimization problems. Motivated by these developments, we introduce a class of stochastic differential equations (SDEs) called gradient-adjusted underdamped Langevin dynamics (GAUL), which adds stochastic perturbations to the primal-dual damping and Hessian-driven damping dynamics from optimization. We prove that GAUL admits the correct stationary distribution, whose marginal is the target distribution. For Gaussian target distributions, the proposed method outperforms overdamped and underdamped Langevin dynamics in convergence speed measured in total variation distance. Moreover, using the Euler-Maruyama discretization, we show that the mixing time towards a biased target distribution depends only on the square root of the condition number of the target covariance matrix. Numerical experiments on non-Gaussian target distributions, such as Bayesian regression problems and Bayesian neural networks, further illustrate the advantages of our approach.
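To make the baseline concrete, the unadjusted Langevin algorithm (ULA) mentioned above is the Euler-Maruyama discretization of overdamped Langevin dynamics, dX_t = -∇f(X_t) dt + √2 dW_t, whose stationary density is proportional to exp(-f). The sketch below is an illustration only, not the GAUL method proposed in the paper; the Gaussian target, its covariance `Sigma`, and the step size are arbitrary assumptions.

```python
import numpy as np

def ula(grad_f, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm (Euler-Maruyama scheme):
    x_{k+1} = x_k - step * grad_f(x_k) + sqrt(2 * step) * N(0, I)."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        noise = rng.standard_normal(x.size)
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Hypothetical Gaussian target N(0, Sigma): f(x) = 0.5 x^T Sigma^{-1} x,
# so grad f(x) = Sigma^{-1} x.
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)
grad_f = lambda x: Sigma_inv @ x

rng = np.random.default_rng(0)
samples = ula(grad_f, x0=np.zeros(2), step=0.05, n_steps=20000, rng=rng)

# Discard burn-in; the empirical covariance should approximate Sigma,
# up to the O(step) discretization bias that ULA incurs.
emp_cov = np.cov(samples[5000:].T)
print(emp_cov)
```

As the abstract notes, this discretization targets a slightly biased distribution: the bias shrinks with the step size, while the mixing speed degrades with the condition number of the target covariance, which is the trade-off GAUL is designed to improve.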