Diffusion models for continuous data have gained widespread adoption owing to their high-quality generation and control mechanisms. However, controllable diffusion on discrete data remains challenging because guidance methods developed for continuous data do not directly apply to discrete diffusion. Here, we provide a straightforward derivation of classifier-free and classifier-based guidance for discrete diffusion, as well as a new class of diffusion models that leverage uniform noise and are more guidable because they can continuously edit their outputs. We improve the quality of these models with a novel continuous-time variational lower bound that yields state-of-the-art performance, especially in settings involving guidance or fast generation. Empirically, we demonstrate that our guidance mechanisms combined with uniform noise diffusion improve controllable generation relative to autoregressive and diffusion baselines on several discrete data domains, including genomic sequences, small molecule design, and discretized image generation.
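To make the idea of classifier-free guidance on discrete data concrete, the sketch below applies the standard classifier-free guidance form, p_guided(x | y) ∝ p(x | y)^γ · p(x)^(1−γ), to the categorical distribution over a single token. This is an illustration under stated assumptions, not the derivation referred to in the abstract: the function names, the per-token treatment, and the guidance strength `gamma` are all hypothetical, and the logits stand in for the outputs of a conditional and an unconditional denoising model.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def cfg_token_probs(cond_logits, uncond_logits, gamma):
    """Hypothetical helper: guided categorical distribution for one token.

    Assumes the standard classifier-free guidance form
        p_guided(x) ∝ p(x | y)^gamma * p(x)^(1 - gamma),
    evaluated in log space and renormalized over the vocabulary.
    gamma = 1 recovers the conditional model; gamma = 0 the unconditional one.
    """
    log_cond = log_softmax(cond_logits)
    log_uncond = log_softmax(uncond_logits)
    mixed = [gamma * c + (1.0 - gamma) * u
             for c, u in zip(log_cond, log_uncond)]
    # Renormalize so the result is a valid probability vector.
    return [math.exp(v) for v in log_softmax(mixed)]


# Toy vocabulary of 4 tokens: the conditional model prefers token 0,
# the unconditional model is uniform.
cond = [2.0, 0.0, 0.0, 0.0]
uncond = [0.0, 0.0, 0.0, 0.0]
for gamma in (0.0, 1.0, 2.0):
    probs = cfg_token_probs(cond, uncond, gamma)
    print(gamma, [round(p, 3) for p in probs])
```

In this toy setting, raising `gamma` above 1 shifts probability mass further toward the token the conditional model favors. Because the mixture is renormalized over a finite vocabulary, no continuous-score approximation is needed.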