Categorical random variables can faithfully represent the discrete and uncertain aspects of data as part of a discrete latent variable model. Learning in such models requires taking gradients with respect to the parameters of the categorical probability distributions, which is often intractable due to their combinatorial nature. A popular technique for estimating these otherwise intractable gradients is the Log-Derivative trick, which forms the basis of the well-known REINFORCE gradient estimator and its many extensions. While the Log-Derivative trick allows us to differentiate through samples drawn from categorical distributions, it does not take into account the discrete nature of the distribution itself. Our first contribution addresses this shortcoming by introducing the CatLog-Derivative trick, a variation of the Log-Derivative trick tailored towards categorical distributions. Secondly, we use the CatLog-Derivative trick to introduce IndeCateR, a novel and unbiased gradient estimator for the important case of products of independent categorical distributions, with provably lower variance than REINFORCE. Thirdly, we empirically show that IndeCateR can be efficiently implemented and that its gradient estimates have significantly lower bias and variance for the same number of samples compared to the state of the art.
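As a point of reference for the Log-Derivative trick mentioned above, the following minimal sketch estimates the gradient of an expectation under a single categorical distribution with a plain REINFORCE-style estimator and compares it with the exact gradient obtained by summing over all categories. It is not the CatLog-Derivative trick or IndeCateR. The softmax parameterisation, the helper names (`softmax`, `exact_gradient`, `reinforce_gradient`), and the toy values are assumptions made purely for illustration.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exact_gradient(logits, f_values):
    # Gradient of E[f(X)] = sum_k p_k f_k with respect to the logits.
    # For a softmax parameterisation this equals p_j * (f_j - E[f]).
    p = softmax(logits)
    mean_f = sum(pk * fk for pk, fk in zip(p, f_values))
    return [pk * (fk - mean_f) for pk, fk in zip(p, f_values)]


def reinforce_gradient(logits, f_values, num_samples, rng):
    # Log-Derivative trick: grad E[f(X)] = E[f(X) * grad log p(X)].
    # For softmax logits, d log p_x / d logit_j = 1[j == x] - p_j.
    p = softmax(logits)
    k = len(p)
    grad = [0.0] * k
    for x in rng.choices(range(k), weights=p, k=num_samples):
        for j in range(k):
            score_j = (1.0 if j == x else 0.0) - p[j]
            grad[j] += f_values[x] * score_j
    return [g / num_samples for g in grad]


if __name__ == "__main__":
    logits = [0.5, -1.0, 2.0]    # hypothetical parameters
    f_values = [1.0, 4.0, -2.0]  # hypothetical values of f per category
    rng = random.Random(0)
    print("exact    :", exact_gradient(logits, f_values))
    print("REINFORCE:", reinforce_gradient(logits, f_values, 100_000, rng))
```

With a single categorical variable the exact sum is cheap, so sampling is unnecessary. The sampled estimator becomes relevant when the number of joint configurations grows combinatorially, which is the setting the abstract describes.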