The performance of black-box variational inference is sometimes hindered by gradient estimators with high variance. This variance comes from two sources of randomness: data subsampling and Monte Carlo sampling. Existing control variates address only Monte Carlo noise, and incremental gradient methods typically address only data subsampling. We propose a new "joint" control variate that reduces variance from both sources of noise at once. This significantly reduces gradient variance, leading to faster optimization in several applications.
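To make the two noise sources concrete, the following is a minimal toy sketch of a generic control variate applied over both the data index and the Monte Carlo draw. It is not the estimator proposed in the paper: the function `g`, its Taylor approximation `c`, the data `x`, and all constants are hypothetical choices made only for illustration. The sketch assumes a scalar stand-in for a gradient term, uniform sampling of one data index, and standard normal noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: N scalar data points (illustration only).
N = 100
x = rng.normal(loc=0.3, scale=0.2, size=N)
x_bar = x.mean()


def g(n, eps):
    """Stand-in for a gradient term depending on data index n and noise eps."""
    return np.sin(x[n] + 0.5 * eps)


def c(n, eps):
    """First-order Taylor approximation of g around (x_bar, eps = 0)."""
    return np.sin(x_bar) + np.cos(x_bar) * ((x[n] - x_bar) + 0.5 * eps)


# Expectation of c over uniform n and eps ~ N(0, 1) is available in closed
# form, because x[n] - x_bar averages to zero over n and eps has mean zero.
c_mean = np.sin(x_bar)

# Draw both sources of randomness: a data index and a Monte Carlo sample.
num_draws = 200_000
n = rng.integers(0, N, size=num_draws)
eps = rng.normal(size=num_draws)

# Naive estimator versus the control-variate-corrected estimator.
naive = g(n, eps)
controlled = g(n, eps) - c(n, eps) + c_mean

naive_var = naive.var()
controlled_var = controlled.var()
mean_gap = abs(naive.mean() - controlled.mean())

print("naive variance:     ", naive_var)
print("controlled variance:", controlled_var)
print("gap between means:  ", mean_gap)
```

Because `c` tracks `g` in both arguments and its expectation is known exactly, subtracting it and adding back `c_mean` leaves the mean unchanged while removing most of the variation due to both `n` and `eps`. A control variate that depended on `eps` alone would leave the subsampling noise untouched in this toy setting.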