Provable Convergence of Variational Monte Carlo Methods

Variational Monte Carlo (VMC) is a promising approach for computing the ground-state energy of many-body quantum problems and has attracted growing interest with the development of machine learning. Recent VMC paradigms construct neural networks as trial wave functions, sample quantum configurations using Markov chain Monte Carlo (MCMC), and train the networks with stochastic gradient descent (SGD). However, the theoretical convergence of VMC remains unknown when SGD interacts with MCMC sampling for a given well-designed trial wave function. Although MCMC reduces the difficulty of estimating gradients, it inevitably introduces bias in practice. Moreover, the local energy may be unbounded, which makes the error of MCMC sampling harder to analyze. We therefore assume that the local energy is sub-exponential and use the Bernstein inequality for non-stationary Markov chains to derive error bounds for the MCMC estimator. Consequently, VMC is proven to have a first-order convergence rate of $O(\log K/\sqrt{n K})$ with $K$ iterations and sample size $n$, which partially explains how MCMC influences the behavior of SGD. Furthermore, we verify the so-called correlated negative curvature condition and relate it to the zero-variance phenomenon in solving eigenvalue functions. It is shown that VMC escapes from saddle points and reaches $(\epsilon,\epsilon^{1/4})$-approximate second-order stationary points or $\epsilon^{1/2}$-variance points in at least $O(\epsilon^{-11/2}\log^{2}(1/\epsilon))$ steps with high probability. Our analysis enriches the understanding of how VMC converges efficiently and can be applied to general variational methods in physics and statistics.
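The loop described above (trial wave function, MCMC sampling, SGD update) can be illustrated on a toy problem. The following is a minimal sketch and not the setting analyzed in the paper: it assumes a 1D harmonic oscillator $H = -\tfrac{1}{2}\,d^2/dx^2 + \tfrac{1}{2}x^2$, a one-parameter Gaussian trial function $\psi_\alpha(x) = e^{-\alpha x^2}$ in place of a neural network, and a random-walk Metropolis sampler as the MCMC method. All function names and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def local_energy(x, alpha):
    # E_L = (H psi)/psi for psi_alpha(x) = exp(-alpha x^2) and
    # H = -1/2 d^2/dx^2 + 1/2 x^2 (toy model, an assumption of this sketch).
    return alpha + x**2 * (0.5 - 2.0 * alpha**2)

def metropolis_chain(alpha, n, x0, rng, step=1.0):
    # Random-walk Metropolis targeting |psi_alpha|^2 ∝ exp(-2 alpha x^2).
    samples = np.empty(n)
    x = x0
    for i in range(n):
        proposal = x + step * rng.normal()
        log_ratio = -2.0 * alpha * (proposal**2 - x**2)
        if np.log(rng.uniform()) < log_ratio:
            x = proposal
        samples[i] = x
    return samples

def vmc_sgd(alpha0=1.0, iters=200, n=200, lr=0.1, seed=0):
    # Hypothetical hyperparameters chosen only for this illustration.
    rng = np.random.default_rng(seed)
    alpha, x, energy = alpha0, 0.0, float("nan")
    for _ in range(iters):
        # Warm-start the chain from the previous state: the samples are
        # correlated and not exactly stationary, so the gradient is biased.
        xs = metropolis_chain(alpha, n, x, rng)
        x = xs[-1]
        e_loc = local_energy(xs, alpha)
        energy = e_loc.mean()
        # Standard VMC gradient estimator:
        # 2 * mean[(E_L - mean E_L) * d/dalpha log psi], with d/dalpha log psi = -x^2.
        grad = 2.0 * np.mean((e_loc - energy) * (-xs**2))
        alpha = max(alpha - lr * grad, 1e-3)  # keep psi normalizable
    return alpha, energy

if __name__ == "__main__":
    alpha, energy = vmc_sgd()
    print(f"alpha = {alpha:.4f}, energy estimate = {energy:.4f}")
```

In this toy model the exact ground state corresponds to $\alpha = 1/2$, where the local energy is constant in $x$. The gradient estimator therefore has vanishing variance at the optimum, which is a simple instance of the zero-variance phenomenon mentioned above. The warm-started chain is a deliberate choice to mimic the non-stationary sampling that makes the gradient estimate biased.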
