Gradient estimation, the task of approximating the gradient of an expectation with respect to the parameters of a distribution, is central to the solution of many machine learning problems. However, when the distribution is discrete, most common gradient estimators suffer from excessive variance. To improve the quality of gradient estimation, we introduce a variance reduction technique based on Stein operators for discrete distributions. We then use this technique to build flexible control variates for the REINFORCE leave-one-out estimator. Our control variates can be adapted online to minimize variance and do not require extra evaluations of the target function. In benchmark generative modeling tasks such as training binary variational autoencoders, our gradient estimator achieves substantially lower variance than state-of-the-art estimators with the same number of function evaluations.
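For orientation, the following is a minimal sketch of the plain REINFORCE leave-one-out (RLOO) estimator that the control variates are built on. It does not implement the Stein-operator control variates themselves. It assumes a factorized Bernoulli distribution parameterized by logits `theta`; the function names `sigmoid` and `rloo_gradient` and the toy objective in the usage lines are illustrative choices, not taken from the paper.

```python
import math
import random


def sigmoid(t):
    """Logistic function mapping a logit to a Bernoulli probability."""
    return 1.0 / (1.0 + math.exp(-t))


def rloo_gradient(f, theta, num_samples, rng):
    """Estimate d/dtheta E_{x ~ Bernoulli(sigmoid(theta))}[f(x)] with RLOO.

    Assumption: the distribution factorizes over coordinates, so the score
    function for logit i is simply x_i - sigmoid(theta_i).
    """
    if num_samples < 2:
        raise ValueError("the leave-one-out baseline needs at least 2 samples")
    probs = [sigmoid(t) for t in theta]
    # Draw num_samples independent binary vectors.
    samples = [
        [1.0 if rng.random() < p else 0.0 for p in probs]
        for _ in range(num_samples)
    ]
    # One evaluation of the target function per sample, reused for baselines.
    values = [f(x) for x in samples]
    total = sum(values)
    grad = [0.0] * len(theta)
    for x, fx in zip(samples, values):
        # Baseline for this sample: mean of f over all the *other* samples.
        baseline = (total - fx) / (num_samples - 1)
        for i, p in enumerate(probs):
            grad[i] += (fx - baseline) * (x[i] - p) / num_samples
    return grad


# Usage on a hypothetical toy objective f(x) = sum_i (x_i - 0.45)^2.
rng = random.Random(0)
toy_objective = lambda x: sum((xi - 0.45) ** 2 for xi in x)
estimate = rloo_gradient(toy_objective, [0.0, 0.0], num_samples=4, rng=rng)
```

Because each sample's baseline is computed from the other samples' function values, the estimator stays unbiased and needs no function evaluations beyond the `num_samples` already drawn. This is the same evaluation budget under which the abstract compares estimators.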