Preconditioning is a common method for modifying Markov chain Monte Carlo algorithms with the goal of making them more efficient. In practice it is often extremely effective, even when the preconditioner is learned from the chain itself. We analyse and compare the finite-time computational costs of schemes which learn a preconditioner based on the target covariance, or on the expected Hessian of the target potential, with those of a corresponding scheme that does not use preconditioning. We apply our results to the Unadjusted Langevin Algorithm (ULA) for an appropriately regular target, establishing non-asymptotic guarantees for preconditioned ULA with a learned preconditioner; our results are also applied to the unadjusted underdamped Langevin algorithm in the supplementary material. To do so, we establish non-asymptotic guarantees on the time taken to collect $N$ approximately independent samples from the target for schemes that learn their preconditioners, under the assumption that the underlying Markov chain satisfies a contraction condition in the Wasserstein-2 distance. This approximate independence condition, which we formalise, allows us to bridge the non-asymptotic bounds of modern MCMC theory and the classical heuristics of effective sample size and mixing time, and it is needed to amortise the costs of learning a preconditioner across the many samples it will be used to produce.
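As a point of reference (the precise discretisation analysed in the paper may differ), a standard form of preconditioned ULA for a target $\pi \propto e^{-U}$ on $\mathbb{R}^d$, with step size $h > 0$ and a symmetric positive definite preconditioner $P$, iterates
$$X_{k+1} = X_k - h\, P\, \nabla U(X_k) + \sqrt{2h}\, P^{1/2}\, \xi_{k+1}, \qquad \xi_{k+1} \sim \mathcal{N}(0, I_d),$$
where, in the learned setting described above, $P$ is built from earlier iterates of the chain, for instance as an empirical estimate of the target covariance or of the inverse expected Hessian $\big(\mathbb{E}_\pi[\nabla^2 U]\big)^{-1}$.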