Gradient estimation -- approximating the gradient of an expectation with respect to the parameters of a distribution -- is central to the solution of many machine learning problems. However, when the distribution is discrete, most common gradient estimators suffer from excessive variance. To improve the quality of gradient estimation, we introduce a variance reduction technique based on Stein operators for discrete distributions. We then use this technique to build flexible control variates for the REINFORCE leave-one-out estimator. Our control variates can be adapted online to minimize variance and do not require extra evaluations of the target function. In benchmark generative modeling tasks such as training binary variational autoencoders, our gradient estimator achieves substantially lower variance than state-of-the-art estimators with the same number of function evaluations.
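For orientation, here is a minimal sketch of the baseline the abstract builds on, the REINFORCE leave-one-out (RLOO) estimator, under assumptions that are not taken from the paper: a factorized Bernoulli distribution parameterized by logits, a toy target function, and the hypothetical helper names `rloo_gradient` and `exact_gradient`. It does not implement the Stein-operator control variates themselves; it only illustrates the estimator they are attached to, and it reuses the K function evaluations for the baseline rather than requesting new ones.

```python
import itertools

import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def rloo_gradient(f, logits, num_samples, rng):
    """RLOO estimate of d/d(logits) E_q[f(x)] for q = prod_i Bernoulli(sigmoid(logits_i)).

    Assumption: f maps an array of shape (..., d) to values of shape (...).
    """
    if num_samples < 2:
        raise ValueError("leave-one-out needs at least two samples")
    probs = sigmoid(logits)
    # Draw K binary samples, shape (K, d).
    x = (rng.random((num_samples, logits.shape[0])) < probs).astype(np.float64)
    fx = f(x)  # K function evaluations, shape (K,)
    # Baseline for sample k: mean of f over the other K - 1 samples.
    baseline = (fx.sum() - fx) / (num_samples - 1)
    # Score function of a Bernoulli with logit parameters: x - sigmoid(logits).
    score = x - probs
    return ((fx - baseline)[:, None] * score).mean(axis=0)


def exact_gradient(f, logits):
    """Brute-force gradient by enumerating all 2^d binary vectors (small d only)."""
    probs = sigmoid(logits)
    grad = np.zeros_like(logits)
    for bits in itertools.product([0.0, 1.0], repeat=logits.shape[0]):
        x = np.array(bits)
        q = np.prod(np.where(x == 1.0, probs, 1.0 - probs))
        grad += q * f(x) * (x - probs)
    return grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = np.array([0.3, -0.5, 1.0])

    def toy_f(x):
        # Toy target, chosen only for illustration.
        return np.sum((x - 0.45) ** 2, axis=-1)

    estimates = np.stack([rloo_gradient(toy_f, logits, 4, rng) for _ in range(5000)])
    print("mean of RLOO estimates:", estimates.mean(axis=0))
    print("exact gradient:        ", exact_gradient(toy_f, logits))
    print("per-coordinate variance:", estimates.var(axis=0))
```

Because each sample's baseline is computed from the other samples only, the estimator stays unbiased, which the brute-force comparison lets one check on small problems. The per-coordinate variance printed at the end is the quantity that an added control variate would aim to reduce.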