We study linear preconditioning in Markov chain Monte Carlo. We consider the class of well-conditioned distributions, for which several mixing time bounds depend on the condition number $\kappa$. First, we show that well-conditioned distributions exist for which $\kappa$ can be arbitrarily large and yet no linear preconditioner can reduce it. We then impose two sets of extra assumptions under which a linear preconditioner can significantly reduce $\kappa$. For the random walk Metropolis algorithm, we further provide upper and lower bounds on the spectral gap with tight $1/\kappa$ dependence. This allows us to give conditions under which linear preconditioning provably increases the gap. We then study popular preconditioners such as the covariance, its diagonal approximation, the Hessian at the mode, and the QR decomposition. We give conditions under which each of these reduces $\kappa$ to near its minimum. We also show that the diagonal approach can in fact \textit{increase} the condition number. This is of interest because diagonal preconditioning is the default choice in well-known software packages. We conclude with a numerical study that compares preconditioners across different models and shows how proper preconditioning can greatly reduce compute time in Hamiltonian Monte Carlo.
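The claim that diagonal preconditioning can increase $\kappa$ can be checked on a small example. The following is a minimal sketch, not taken from the paper, that assumes a zero-mean Gaussian target with covariance $\Sigma$. In that case the potential has Hessian $\Sigma^{-1}$, so $\kappa$ is the ratio of the largest to the smallest eigenvalue of $\Sigma$, and a preconditioner $L$ (sampling $y = L^{-1}x$) changes the covariance to $L^{-1}\Sigma L^{-\top}$. The matrix `Sigma` and the helper names `condition_number` and `preconditioned_cov` are illustrative choices.

```python
import numpy as np


def condition_number(cov):
    """Ratio of largest to smallest eigenvalue of a symmetric positive-definite matrix."""
    eigvals = np.linalg.eigvalsh(cov)  # returned in ascending order
    return eigvals[-1] / eigvals[0]


def preconditioned_cov(cov, L):
    """Covariance of y = L^{-1} x when x has covariance `cov`."""
    L_inv = np.linalg.inv(L)
    return L_inv @ cov @ L_inv.T


# Illustrative 3x3 covariance (an assumption of this sketch, not from the paper).
Sigma = np.array([
    [1.5, 1.0, 1.0],
    [1.0, 1.0, 0.5],
    [1.0, 0.5, 1.0],
])

# No preconditioning.
kappa_raw = condition_number(Sigma)

# Diagonal preconditioner: rescale each coordinate by its marginal standard
# deviation, which turns Sigma into its correlation matrix.
L_diag = np.diag(np.sqrt(np.diag(Sigma)))
kappa_diag = condition_number(preconditioned_cov(Sigma, L_diag))

# Full covariance preconditioner: Cholesky factor L with L @ L.T == Sigma.
L_chol = np.linalg.cholesky(Sigma)
kappa_full = condition_number(preconditioned_cov(Sigma, L_chol))

print(f"kappa, no preconditioning:       {kappa_raw:.2f}")
print(f"kappa, diagonal preconditioning: {kappa_diag:.2f}")
print(f"kappa, full covariance:          {kappa_full:.2f}")
```

For this particular matrix the diagonal preconditioner raises $\kappa$ slightly, from roughly 34 to roughly 35.5, while the full covariance preconditioner brings it to 1, the minimum possible value. The Gaussian case is only the simplest setting; the sketch says nothing about when the other preconditioners studied in the paper perform well.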