We introduce a framework rooted in a rate-distortion problem for Markov chains, and show that a suite of commonly used Markov Chain Monte Carlo (MCMC) algorithms arise as specific instances of it, where the target stationary distribution is controlled by the distortion function. Our approach offers a unified variational view on the optimality of algorithms such as Metropolis-Hastings, Glauber dynamics, the swapping algorithm and Feynman-Kac path models. Along the way, we analyze the factorizability and geometry of multivariate Markov chains. Specifically, we demonstrate that induced chains on factors of a product space can be regarded as information projections with respect to a particular divergence. This perspective yields Han-Shearer-type inequalities for Markov chains as well as applications to large deviations and mixing time comparison. Finally, to demonstrate the significance of our framework, we propose a new projection sampler based on the swapping algorithm that provably improves the mixing time by multiplicative factors related to the number of temperatures and the dimension of the underlying state space.
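As a point of reference for the algorithms named above, the following is a minimal sketch of the textbook Metropolis-Hastings kernel on a finite state space, with a numerical check that the target is stationary. It does not implement the rate-distortion formulation of this work. The function names `metropolis_hastings_kernel` and `stationary_residual`, the uniform proposal `Q`, and the target `pi` are illustrative assumptions, not taken from the source.

```python
def metropolis_hastings_kernel(proposal, target):
    """Build the Metropolis-Hastings transition matrix on a finite state space.

    proposal: row-stochastic matrix Q as a list of lists (hypothetical input).
    target:   strictly positive probability vector pi (hypothetical input).
    """
    n = len(target)
    kernel = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y and proposal[x][y] > 0.0:
                # Accept the proposed move x -> y with probability min(1, ratio).
                ratio = (target[y] * proposal[y][x]) / (target[x] * proposal[x][y])
                kernel[x][y] = proposal[x][y] * min(1.0, ratio)
        # Rejected proposals (and self-proposals) keep the chain at x.
        kernel[x][x] = 1.0 - sum(kernel[x])
    return kernel


def stationary_residual(kernel, target):
    """Return max_y |(pi P)(y) - pi(y)|, which is near 0 when pi is stationary."""
    n = len(target)
    return max(
        abs(sum(target[x] * kernel[x][y] for x in range(n)) - target[y])
        for y in range(n)
    )


# Toy example (assumed values): uniform proposal on three states,
# non-uniform target distribution.
Q = [[1 / 3] * 3 for _ in range(3)]
pi = [0.5, 0.3, 0.2]
P = metropolis_hastings_kernel(Q, pi)
```

Calling `stationary_residual(P, pi)` on this toy example returns a value at floating-point noise level, reflecting that the kernel is reversible with respect to `pi` by construction.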