Understanding the gradient variance of black-box variational inference (BBVI) is a crucial step for establishing its convergence and developing algorithmic improvements. However, existing studies have yet to show that the gradient variance of BBVI satisfies the conditions used to study the convergence of stochastic gradient descent (SGD), the workhorse of BBVI. In this work, we show that BBVI satisfies a matching bound corresponding to the $ABC$ condition used in the SGD literature when applied to smooth and quadratically-growing log-likelihoods. Our results generalize to nonlinear covariance parameterizations widely used in the practice of BBVI. Furthermore, we show that the variance of the mean-field parameterization has provably superior dimensional dependence.
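For reference, the $ABC$ condition mentioned above is commonly stated in the SGD literature in roughly the following form. The symbols here are notation assumed for illustration and are not taken from the abstract: $F$ is the objective, $F^{*}$ a lower bound on it, $\lambda$ the variational parameters, and $\widehat{g}$ an unbiased stochastic gradient estimator. The exact statement proved in this work may differ in its constants and assumptions.

$$
\mathbb{E}\,\bigl\lVert \widehat{g}(\lambda) \bigr\rVert_2^2 \;\le\; 2A\,\bigl(F(\lambda) - F^{*}\bigr) \;+\; B\,\bigl\lVert \nabla F(\lambda) \bigr\rVert_2^2 \;+\; C, \qquad A, B, C \ge 0 .
$$

Read this way, a "matching bound" for BBVI would be a bound on the second moment of the gradient estimator by a term proportional to the suboptimality gap, a term proportional to the squared gradient norm, and a constant.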