From Scores to Gibbs Correctors: Accelerating Uniform-Rate Discrete Diffusion Models

Discrete diffusion models have achieved strong empirical performance in text and other symbolic domains, but they often require many steps to generate a single sample, especially in the uniform-rate setting. Existing acceleration methods either rely on training additional quantities or suffer from slow mixing. In this work, we propose a novel Gibbs-based corrector for discrete diffusion models, termed Gibbs-Accelerated Discrete Diffusion (GADD). GADD leverages the structure of the concrete score function to construct Gibbs posterior likelihoods directly, without requiring any additional training beyond standard score estimation. We show that GADD achieves an overall sampling complexity of $\mathcal{O}(\mathrm{polylog}(\varepsilon^{-1}))$, which is the first such rate among diffusion-based samplers for uniform-rate discrete diffusion models. We also conduct numerical experiments demonstrating the practical advantages of GADD across synthetic data, zero-shot text sampling, and zero-shot conditional music generation. These results corroborate the theory and show that GADD consistently improves sample quality and wall-clock efficiency over standard baselines, including vanilla Euler methods and CTMC correctors. In addition, our theoretical analysis introduces a novel framework for analyzing predictor-corrector methods in discrete diffusion models, which may be of independent interest. Unlike existing approaches that rely on the Girsanov change-of-measure technique, our analysis is based on an induction argument that tracks error propagation across predictor iterations while accounting for inaccuracies in the corrector updates.
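To illustrate why a concrete score can yield Gibbs conditionals without extra training, the following is a minimal sketch and not the GADD algorithm itself. It assumes the score at site i is the ratio p(x with site i set to v) / p(x); normalizing these ratios over v gives the single-site conditional p(x_i = v | x_{-i}). The toy target, the exact (rather than learned, time-dependent) score, and all function names are hypothetical choices made for this example.

```python
import itertools
import random


def make_toy_target(length=3, vocab_size=2, seed=0):
    """Hypothetical strictly positive distribution over token tuples."""
    rng = random.Random(seed)
    states = list(itertools.product(range(vocab_size), repeat=length))
    weights = [rng.uniform(0.1, 1.0) for _ in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}


def concrete_score(p, x, i, vocab_size):
    """Exact ratios p(x with site i set to v) / p(x) for every token v.

    Assumption: in practice this row would come from a trained score
    network at a given noise level; here it is computed from the toy p.
    """
    base = p[x]
    return [p[x[:i] + (v,) + x[i + 1:]] / base for v in range(vocab_size)]


def gibbs_conditional(score_row):
    """Normalize one site's score row into p(x_i = v | x_{-i})."""
    total = sum(score_row)
    return [s / total for s in score_row]


def gibbs_sweep(p, x, vocab_size, rng):
    """Resample every site once from its score-derived conditional."""
    for i in range(len(x)):
        probs = gibbs_conditional(concrete_score(p, x, i, vocab_size))
        v = rng.choices(range(vocab_size), weights=probs)[0]
        x = x[:i] + (v,) + x[i + 1:]
    return x


if __name__ == "__main__":
    target = make_toy_target()
    state = (0, 1, 0)
    for _ in range(5):
        state = gibbs_sweep(target, state, vocab_size=2, rng=random.Random())
    print(state)
```

The normalization step works because the common factor 1 / p(x) cancels, so the conditional depends only on the score ratios. How GADD handles estimated scores and interleaves corrector updates with predictor steps is not specified in the abstract and is not modeled here.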
