The performance of black-box variational inference is sometimes hindered by high-variance gradient estimators. This variance comes from two sources of randomness: data subsampling and Monte Carlo sampling. Existing control variates address only Monte Carlo noise, and incremental gradient methods typically address only data subsampling. We propose a new "joint" control variate that reduces variance from both sources of noise at once. This significantly reduces gradient variance, leading to faster optimization in several applications.
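To make the two noise sources concrete, the following is a minimal toy sketch, not the construction used in the paper. It assumes a hypothetical scalar "gradient term" `grad_term(n, eps)` that depends on a sampled datum index `n` and a Monte Carlo draw `eps`, and a hypothetical approximation `approx_term` whose mean over both `n` and `eps` is known in closed form. All names and the toy functions are illustrative assumptions. The sketch compares a control variate that removes only Monte Carlo noise, one that removes only subsampling noise, and one that is zero-mean jointly in `(n, eps)`.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                    # hypothetical dataset size
a = rng.normal(size=N)     # per-datum offsets: source of subsampling noise
b = rng.normal(size=N)     # per-datum slopes: scale the Monte Carlo noise


def grad_term(n, eps):
    """Toy stand-in for a per-datum, per-sample gradient term (not a real ELBO gradient)."""
    return a[n] + b[n] * eps + 0.1 * np.sin(eps)


def approx_term(n, eps):
    """Cheap approximation; its mean over uniform n and standard normal eps is a.mean()."""
    return a[n] + b[n] * eps


S = 200_000                        # number of simulated single-sample estimates
n = rng.integers(0, N, size=S)     # data subsampling: uniform index draws
eps = rng.normal(size=S)           # Monte Carlo sampling: standard normal draws

naive = grad_term(n, eps)
# Zero-mean in eps for every n: cancels Monte Carlo noise only.
mc_only = naive - b[n] * eps
# Zero-mean in n for every eps: cancels subsampling noise only.
sub_only = naive - (a[n] - a.mean())
# Zero-mean over the joint draw (n, eps): cancels both.
joint = naive - (approx_term(n, eps) - a.mean())

for name, est in [("naive", naive), ("mc_only", mc_only),
                  ("sub_only", sub_only), ("joint", joint)]:
    print(f"{name:>8}: mean={est.mean():+.4f}  var={est.var():.4f}")
```

In this toy setting all four estimators share the same expectation, since each subtracted term has mean zero. Each single-source correction leaves the other source's variance in place, while the joint correction leaves only the small residual that the approximation does not capture. In practice the expectation of the approximation over the full dataset is generally not available this cheaply, which is the kind of difficulty a practical method has to address.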