Faster high-accuracy log-concave sampling via algorithmic warm starts

Understanding the complexity of sampling from a strongly log-concave and log-smooth distribution $\pi$ on $\mathbb{R}^d$ to high accuracy is a fundamental problem, from both a practical and a theoretical standpoint. In practice, high-accuracy samplers such as the classical Metropolis-adjusted Langevin algorithm (MALA) remain the de facto gold standard; and in theory, via the proximal sampler reduction, it is understood that such samplers are key for sampling even beyond log-concavity (in particular, for distributions satisfying isoperimetric assumptions). In this work, we improve the dimension dependence of this sampling problem to $\tilde{O}(d^{1/2})$, whereas the previous best result for MALA was $\tilde{O}(d)$. This closes the long line of work on the complexity of MALA, and moreover leads to state-of-the-art guarantees for high-accuracy sampling under strong log-concavity and beyond (thanks to the aforementioned reduction). Our starting point is that the complexity of MALA improves to $\tilde{O}(d^{1/2})$, but only under a warm start (an initialization with constant R\'enyi divergence w.r.t. $\pi$). Previous algorithms took much longer to find a warm start than to use it, and closing this gap has remained an important open problem in the field. Our main technical contribution settles this problem by establishing the first $\tilde{O}(d^{1/2})$ R\'enyi mixing rates for the discretized underdamped Langevin diffusion. For this, we develop new differential-privacy-inspired techniques based on R\'enyi divergences with Orlicz--Wasserstein shifts, which allow us to sidestep longstanding challenges for proving fast convergence of hypocoercive differential equations.
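
For readers unfamiliar with MALA, the following is a minimal textbook sketch of the algorithm, not the method or analysis developed in this work. It targets $\pi \propto \exp(-f)$ with a Langevin proposal followed by a Metropolis-Hastings filter. The target (a standard Gaussian), the dimension `d`, the step size `0.5 / sqrt(d)`, the iteration count, and all function names are illustrative assumptions. The initialization is an exact draw from $\pi$, which is the idealized extreme of a warm start rather than one produced algorithmically.

```python
import math
import random


def mala(f, grad_f, x0, step, n_steps, rng):
    """Run MALA targeting pi(x) proportional to exp(-f(x)).

    Returns the final state and the fraction of accepted proposals.
    """
    x = list(x0)
    sqrt_2h = math.sqrt(2.0 * step)
    accepted = 0
    for _ in range(n_steps):
        gx = grad_f(x)
        # Langevin proposal: y = x - h * grad f(x) + sqrt(2h) * standard Gaussian noise.
        y = [xi - step * gi + sqrt_2h * rng.gauss(0.0, 1.0) for xi, gi in zip(x, gx)]
        gy = grad_f(y)
        # Log proposal densities up to a shared additive constant:
        # log q(b | a) = -||b - a + h * grad f(a)||^2 / (4h).
        log_q_forward = -sum(
            (yi - xi + step * gi) ** 2 for xi, yi, gi in zip(x, y, gx)
        ) / (4.0 * step)
        log_q_reverse = -sum(
            (xi - yi + step * gi) ** 2 for xi, yi, gi in zip(x, y, gy)
        ) / (4.0 * step)
        # Metropolis-Hastings filter: accept with probability min(1, ratio).
        log_ratio = f(x) - f(y) + log_q_reverse - log_q_forward
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = y
            accepted += 1
    return x, accepted / n_steps


def gaussian_potential(x):
    # f(x) = ||x||^2 / 2, so pi is the standard Gaussian (strongly log-concave and log-smooth).
    return 0.5 * sum(xi * xi for xi in x)


def gaussian_gradient(x):
    # grad f(x) = x for the potential above.
    return list(x)


d = 16  # illustrative dimension
rng = random.Random(0)
warm_start = [rng.gauss(0.0, 1.0) for _ in range(d)]  # exact draw from pi (idealized warm start)
step = 0.5 / math.sqrt(d)  # illustrative choice only, not a tuned or proven step size
sample, acceptance_rate = mala(
    gaussian_potential, gaussian_gradient, warm_start, step, 2000, rng
)
print(f"acceptance rate: {acceptance_rate:.2f}")
```

The filter is what makes MALA a high-accuracy sampler: the chain leaves $\pi$ exactly invariant, so the step size affects only how quickly the chain mixes, not the bias of its stationary distribution.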
