Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling

We consider the constrained sampling problem, in which the goal is to sample from a target distribution $\pi(x)\propto e^{-f(x)}$ when $x$ is constrained to lie in a convex body $\mathcal{C}$. Motivated by penalty methods from continuous optimization, we propose penalized Langevin dynamics (PLD) and penalized underdamped Langevin Monte Carlo (PULMC) methods, which convert the constrained sampling problem into an unconstrained one by introducing a penalty function for constraint violations. When $f$ is smooth and gradients are available, we obtain an iteration complexity of $\tilde{\mathcal{O}}(d/\varepsilon^{10})$ for PLD to sample the target up to an $\varepsilon$-error, where the error is measured in the total variation (TV) distance and $\tilde{\mathcal{O}}(\cdot)$ hides logarithmic factors. For PULMC, we improve the result to $\tilde{\mathcal{O}}(\sqrt{d}/\varepsilon^{7})$ when the Hessian of $f$ is Lipschitz and the boundary of $\mathcal{C}$ is sufficiently smooth. To our knowledge, these are the first convergence results for underdamped Langevin Monte Carlo methods in constrained sampling that handle non-convex $f$, and they provide guarantees with the best dimension dependency among existing methods that use deterministic gradients. If unbiased stochastic estimates of the gradient of $f$ are available, we propose the PSGLD and PSGULMC methods, which can handle stochastic gradients and are scalable to large datasets without requiring Metropolis-Hastings correction steps. For PSGLD and PSGULMC, when $f$ is strongly convex and smooth, we obtain iteration complexities of $\tilde{\mathcal{O}}(d/\varepsilon^{18})$ and $\tilde{\mathcal{O}}(d\sqrt{d}/\varepsilon^{39})$, respectively, in the 2-Wasserstein ($W_2$) distance. When $f$ is smooth and can be non-convex, we provide finite-time performance bounds and iteration complexity results. Finally, we illustrate the performance of our methods on Bayesian LASSO regression and Bayesian constrained deep learning problems.
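
The penalty idea can be illustrated with a minimal sketch of an overdamped Langevin iteration on a penalized potential. Everything specific here is an assumption made for illustration and is not taken from the abstract: the target $f(x)=\|x\|^2/2$, the constraint set (the unit Euclidean ball), the penalty $S(x)=\mathrm{dist}(x,\mathcal{C})^2$ scaled by a hypothetical parameter `delta`, and the step size `eta`. The sampler runs unconstrained Langevin steps on $f(x) + S(x)/\delta$.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def grad_f(x):
    """Gradient of the illustrative potential f(x) = ||x||^2 / 2 (an assumption)."""
    return x

def grad_penalty(x, radius=1.0):
    """Gradient of the assumed penalty S(x) = dist(x, C)^2; zero inside C."""
    return 2.0 * (x - project_ball(x, radius))

def penalized_langevin(x0, n_steps, eta, delta, rng):
    """Unadjusted Langevin steps on the penalized potential f + S / delta."""
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        drift = grad_f(x) + grad_penalty(x) / delta
        noise = rng.standard_normal(x.size)
        x = x - eta * drift + np.sqrt(2.0 * eta) * noise
        samples[k] = x
    return samples

rng = np.random.default_rng(0)
samples = penalized_langevin(np.zeros(2), n_steps=5000, eta=1e-3, delta=1e-2, rng=rng)
```

In this sketch, a smaller `delta` penalizes constraint violations more strongly, so iterates stay closer to $\mathcal{C}$, but the penalized gradient becomes stiffer and the step size `eta` has to shrink accordingly.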
