We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. In particular, we prove a quadratic bound on the gradient variance of the STL estimator, one which also encompasses misspecified variational families. Combined with previous work on the quadratic variance condition, this directly implies convergence of BBVI with projected stochastic gradient descent. We also improve the existing analysis of the regular closed-form entropy gradient estimators, which enables comparison against the STL estimator and provides explicit non-asymptotic complexity guarantees for both.
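To illustrate why perfect variational family specification matters for the STL estimator, the following minimal sketch compares the STL and closed-form entropy (CFE) gradient estimators on a toy problem. The setup is an assumption made purely for illustration and is not taken from the analysis itself: a one-dimensional Gaussian target N(mu_p, sigma_p^2), a Gaussian variational family parameterized by location m and scale s, and hand-derived gradients of the negative ELBO. All function names (`grad_log_p`, `cfe_gradient`, `stl_gradient`, `gradient_variances`) are hypothetical.

```python
import random
import statistics


def grad_log_p(z, mu_p, sigma_p):
    """Score of the (assumed) Gaussian target N(mu_p, sigma_p^2) w.r.t. z."""
    return -(z - mu_p) / sigma_p**2


def cfe_gradient(m, s, eps, mu_p, sigma_p):
    """Closed-form entropy estimator of the negative-ELBO gradient w.r.t. (m, s)."""
    z = m + s * eps  # reparameterized sample, eps ~ N(0, 1)
    dlogp = grad_log_p(z, mu_p, sigma_p)
    # The entropy of N(m, s^2) is log(s) + const, so its gradient is (0, 1/s).
    return (-dlogp, -dlogp * eps - 1.0 / s)


def stl_gradient(m, s, eps, mu_p, sigma_p):
    """STL estimator: path gradient of log p(z) - log q(z), with the
    parameters inside log q held fixed (the stop-gradient)."""
    z = m + s * eps
    dlogp = grad_log_p(z, mu_p, sigma_p)
    dlogq = -(z - m) / s**2  # score of q w.r.t. z only
    diff = dlogp - dlogq
    return (-diff, -diff * eps)


def gradient_variances(estimator, m, s, mu_p, sigma_p, n=10000, seed=0):
    """Empirical per-coordinate variance of an estimator over n draws of eps."""
    rng = random.Random(seed)
    grads = [estimator(m, s, rng.gauss(0.0, 1.0), mu_p, sigma_p) for _ in range(n)]
    return tuple(statistics.pvariance(component) for component in zip(*grads))


# Evaluate both estimators at the optimum (m, s) = (mu_p, sigma_p),
# where the variational family contains the target exactly.
mu_p, sigma_p = 1.5, 0.7
var_cfe = gradient_variances(cfe_gradient, mu_p, sigma_p, mu_p, sigma_p)
var_stl = gradient_variances(stl_gradient, mu_p, sigma_p, mu_p, sigma_p)
print("CFE variance (m, s):", var_cfe)
print("STL variance (m, s):", var_stl)
```

In this toy setting the two scores cancel sample by sample at the optimum, so the STL gradient noise vanishes there while the CFE noise does not. This is the intuition behind a geometric rate being attainable with STL under perfect specification; the sketch does not reproduce the quadratic variance bound or the projection step.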