Discrete diffusion models have become highly effective across various domains. However, real-world applications often require the generative process to adhere to certain constraints, typically without task-specific fine-tuning. To this end, we propose a training-free method based on Sequential Monte Carlo (SMC) to sample from the reward-aligned target distribution at test time. Our approach leverages twisted SMC with an approximate locally optimal proposal obtained via a first-order Taylor expansion of the reward function. To address the challenge of ill-defined gradients in discrete spaces, we incorporate a Gumbel-Softmax relaxation, enabling efficient gradient-based approximation within the discrete generative framework. Empirical results on both synthetic datasets and image modelling validate the effectiveness of our approach.
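To make the proposal construction concrete, the following is a minimal sketch (assuming PyTorch) of a single reward-tilted sampling step: a Gumbel-Softmax relaxation supplies a differentiable stand-in for a discrete sample, a first-order Taylor expansion of the reward tilts the model's logits toward the locally optimal proposal, and the returned log importance weights feed the twisted SMC reweighting step. The names `guided_step`, `reward_fn`, and `guidance` are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

def guided_step(logits, reward_fn, tau=0.5, guidance=1.0):
    """One reward-tilted proposal step (an illustrative sketch, not the paper's code).

    logits    : (B, L, V) model logits for p_theta(x_{t-1} | x_t)
    reward_fn : maps a relaxed one-hot tensor (B, L, V) to a per-sample reward (B,)
    Returns sampled tokens and per-sample log importance weights
    log p_theta(x) - log q(x) for the SMC reweighting step.
    """
    # Gumbel-Softmax relaxation: a differentiable point on the simplex
    # standing in for a discrete sample, so reward gradients are defined.
    x_relaxed = F.gumbel_softmax(logits, tau=tau).detach().requires_grad_(True)
    reward = reward_fn(x_relaxed).sum()
    grad = torch.autograd.grad(reward, x_relaxed)[0]          # (B, L, V)

    # First-order Taylor expansion of the reward around x_relaxed:
    # r(one_hot(v)) ~= const + grad[..., v], so the locally optimal
    # proposal q ∝ p_theta * exp(guidance * r) has tilted logits:
    proposal_logits = logits + guidance * grad

    # Sample from the tilted proposal.
    q = torch.distributions.Categorical(logits=proposal_logits)
    x = q.sample()                                            # (B, L)

    # Incremental importance weight for twisted SMC resampling:
    # log w = log p_theta(x) - log q(x); the reward twist is applied
    # in the outer SMC loop before resampling particles.
    p = torch.distributions.Categorical(logits=logits)
    log_w = (p.log_prob(x) - q.log_prob(x)).sum(dim=-1)       # (B,)
    return x, log_w
```

In an outer loop over denoising steps, one would accumulate `log_w` per particle, add the reward twist, and multinomially resample particles when the effective sample size drops; the constant Taylor terms cancel under the softmax, which is why only the gradient enters `proposal_logits`.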