Understanding the gradient variance of black-box variational inference (BBVI) is a crucial step for establishing its convergence and developing algorithmic improvements. However, existing studies have yet to show that the gradient variance of BBVI satisfies the conditions used to study the convergence of stochastic gradient descent (SGD), the workhorse of BBVI. In this work, we show that BBVI satisfies a matching bound corresponding to the $ABC$ condition used in the SGD literature when applied to smooth and quadratically growing log-likelihoods. Our results generalize to the nonlinear covariance parameterizations widely used in BBVI practice. Furthermore, we show that the variance of the mean-field parameterization has provably superior dimensional dependence.
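
For reference, the $ABC$ condition is usually stated in the SGD literature in the form below. The notation is an assumption made here for illustration and is not taken from the abstract: $F$ is the objective being minimized, $F^{*}$ is its infimum, $\lambda$ denotes the variational parameters, and $g(\lambda)$ is an unbiased stochastic estimate of $\nabla F(\lambda)$.

$$
\mathbb{E}\,\lVert g(\lambda) \rVert_2^2 \;\le\; 2A\,\bigl(F(\lambda) - F^{*}\bigr) + B\,\lVert \nabla F(\lambda) \rVert_2^2 + C,
\qquad A, B, C \ge 0 .
$$

Read this way, a "matching" bound would mean that the second moment of the BBVI gradient estimator is controlled from above by an expression of this shape, with a lower bound of comparable form.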