Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions. Of key interest in statistics and machine learning is Gaussian VI, which approximates $\pi$ by minimizing the Kullback-Leibler (KL) divergence to $\pi$ over the space of Gaussians. In this work, we develop the (Stochastic) Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI. Our approach exploits the composite structure of the KL divergence, which can be written as the sum of a smooth term (the potential) and a non-smooth term (the entropy) over the Bures-Wasserstein (BW) space of Gaussians endowed with the Wasserstein distance. For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave, as well as the first convergence guarantees to first-order stationary solutions when $\pi$ is only log-smooth.
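To make the composite structure concrete: writing $\pi \propto \exp(-V)$, the objective is $\mathrm{KL}(p \,\|\, \pi) = \mathbb{E}_p[V] + \mathbb{E}_p[\log p] + \text{const}$, a potential term plus a negative-entropy term. The sketch below shows one plausible form of a forward-backward iteration on a Gaussian $p = \mathcal{N}(m, \Sigma)$: a gradient step on the potential term, then a closed-form proximal step on the entropy term. The abstract does not state the update formulas, so the specific updates, the function names (`fb_gvi_step`, `run_gaussian_demo`), and the callable interface for $\mathbb{E}_p[\nabla V]$ and $\mathbb{E}_p[\nabla^2 V]$ are assumptions made for illustration. The demo uses a Gaussian target so that both expectations are available exactly; a stochastic variant would presumably replace them with sample estimates.

```python
import numpy as np


def fb_gvi_step(m, Sigma, grad_mean, hess_mean, h):
    """One forward-backward step on (m, Sigma) with step size h.

    grad_mean and hess_mean are hypothetical callables returning
    E_p[grad V] and E_p[hess V] for p = N(m, Sigma).
    """
    # Forward (gradient) step on the potential term E_p[V].
    M = np.eye(len(m)) - h * hess_mean(m, Sigma)
    m_new = m - h * grad_mean(m, Sigma)
    Sigma_half = M @ Sigma @ M
    Sigma_half = 0.5 * (Sigma_half + Sigma_half.T)  # symmetrize against round-off

    # Backward (proximal) step on the entropy term (assumed closed form),
    # applied to each eigenvalue: lam -> (lam + 2h + sqrt(lam * (lam + 4h))) / 2.
    lam, U = np.linalg.eigh(Sigma_half)
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative eigenvalues
    lam_new = 0.5 * (lam + 2.0 * h + np.sqrt(lam * (lam + 4.0 * h)))
    Sigma_new = (U * lam_new) @ U.T
    return m_new, Sigma_new


def run_gaussian_demo(steps=500, h=0.1):
    """Run the sketch on a Gaussian target with V(x) = 0.5 (x - mu)^T A (x - mu)."""
    mu = np.array([1.0, -2.0])              # illustrative target mean
    A = np.array([[2.0, 0.5], [0.5, 1.0]])  # illustrative target precision

    # For a quadratic potential the Gaussian expectations are exact.
    def grad_mean(m, Sigma):
        return A @ (m - mu)

    def hess_mean(m, Sigma):
        return A

    m, Sigma = np.zeros(2), np.eye(2)
    for _ in range(steps):
        m, Sigma = fb_gvi_step(m, Sigma, grad_mean, hess_mean, h)
    return m, Sigma, mu, np.linalg.inv(A)
```

Calling `run_gaussian_demo()` returns the final iterate together with the target mean and covariance, so the two can be compared directly. For this log-smooth, log-concave target, the iterates should approach $m = \mu$ and $\Sigma = A^{-1}$ when the step size is below the inverse of the largest eigenvalue of $A$.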