Learning models with categorical variables requires optimizing expectations over discrete distributions, a setting in which stochastic gradient-based optimization is challenging because categorical sampling is non-differentiable. A common workaround is to replace the discrete distribution with a continuous relaxation, yielding a smooth surrogate whose gradients can be estimated via the reparameterization trick. Building on this idea, we introduce ReDGE, a novel and efficient diffusion-based soft reparameterization method for categorical distributions. Our approach defines a flexible class of gradient estimators that includes the Straight-Through estimator as a special case. Experiments spanning latent variable models and inference-time reward guidance in discrete diffusion models demonstrate that ReDGE consistently matches or outperforms existing gradient-based methods. The code will be made available at https://github.com/samsongourevitch/redge.
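To make the baseline concrete, here is a minimal sketch of the standard continuous-relaxation approach the abstract refers to: Gumbel-Softmax sampling with an optional straight-through hard sample. This illustrates the Straight-Through estimator that ReDGE recovers as a special case; it is not the ReDGE method itself, and the function name and NumPy setting are illustrative assumptions.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, hard=False, rng=None):
    """Sample from a Gumbel-Softmax (Concrete) relaxation of a categorical.

    Illustrative baseline only -- not the paper's ReDGE estimator.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(U))
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    # Temperature-controlled softmax over the perturbed logits
    z = (logits + g) / tau
    z = z - z.max()  # numerical stability
    soft = np.exp(z) / np.exp(z).sum()
    if not hard:
        return soft  # smooth surrogate: differentiable in the logits
    # Straight-through variant: the forward pass emits a hard one-hot
    # sample; in an autodiff framework the gradient would flow through
    # `soft` via hard = one_hot + (soft - stop_gradient(soft)).
    one_hot = np.zeros_like(soft)
    one_hot[np.argmax(soft)] = 1.0
    return one_hot

logits = np.log(np.array([0.1, 0.3, 0.6]))
y_soft = gumbel_softmax_sample(logits, tau=0.5)          # point on the simplex
y_hard = gumbel_softmax_sample(logits, tau=0.5, hard=True)  # one-hot sample
```

As the temperature `tau` tends to zero the soft samples concentrate on one-hot vectors, recovering categorical sampling at the cost of higher gradient variance; the straight-through variant keeps discrete forward samples while reusing the relaxation's gradient.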