Guidance provides a simple and effective framework for posterior sampling by steering the generation process towards the desired distribution. For discrete data, existing approaches mostly rely on guidance with a first-order approximation to improve sampling efficiency. However, such an approximation is ill-suited to discrete state spaces, where the approximation error can be large. To address this problem, we propose a novel guidance framework for discrete data: we derive the exact transition rate for the desired distribution given a learned discrete flow matching model, leading to guidance that requires only a single forward pass in each sampling step and thus significantly improves efficiency. The framework is unified and general: it encompasses existing guidance methods as special cases and can be seamlessly applied to masked diffusion models. We demonstrate the effectiveness of the proposed guidance on energy-guided simulations and on preference alignment for text-to-image generation and multimodal understanding tasks. The code is available at https://github.com/WanZhengyan/Discrete-Guidance-Matching.
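To make the concern about first-order approximations concrete, the following is a minimal sketch, not the method derived in this work. It assumes a common rate-reweighting view of guidance, in which each single-coordinate jump rate is multiplied by the exponential of a log-ratio f(x') - f(x). The quadratic log-guidance term f, the weights `W`, the offset `c`, and all function names are hypothetical choices made only for illustration. The sketch compares the exact log-ratio for every single-coordinate jump with its first-order Taylor estimate on a one-hot encoding.

```python
import numpy as np

def log_guidance(X, W, c):
    """Hypothetical log-guidance term f(X) = -0.5 * (<W, X> - c)^2 on a one-hot state X."""
    return -0.5 * (np.sum(W * X) - c) ** 2

def grad_log_guidance(X, W, c):
    """Analytic gradient of f with respect to the (relaxed) one-hot input."""
    return -(np.sum(W * X) - c) * W

def compare_log_ratios(x, W, c):
    """Return exact and first-order log-ratios for every single-coordinate jump.

    x: integer array of shape (D,), each entry in {0, ..., S-1}.
    Both outputs have shape (D, S); entry (d, s) refers to setting x[d] = s.
    """
    D, S = W.shape
    X = np.eye(S)[x]                      # one-hot encoding, shape (D, S)
    f0 = log_guidance(X, W, c)
    G = grad_log_guidance(X, W, c)

    # Exact log-ratio: one evaluation of f per candidate jump (D * S in total).
    exact = np.empty((D, S))
    for d in range(D):
        for s in range(S):
            X_new = X.copy()
            X_new[d] = np.eye(S)[s]
            exact[d, s] = log_guidance(X_new, W, c) - f0

    # First-order estimate: <grad f(X), X_new - X> = G[d, s] - G[d, x[d]],
    # obtained from a single gradient evaluation.
    approx = G - G[np.arange(D), x][:, None]
    return exact, approx

rng = np.random.default_rng(0)
D, S = 4, 5                               # assumed toy sizes
W = rng.normal(size=(D, S))
x = rng.integers(0, S, size=D)
exact, approx = compare_log_ratios(x, W, c=0.0)

# Multiplicative error that the first-order estimate would induce in a reweighted rate.
rate_ratio = np.exp(exact - approx)
print("max |log-ratio error|:", np.abs(exact - approx).max())
print("min exact/approx rate ratio:", rate_ratio.min())
```

For this particular f, the gap between the two log-ratios equals -0.5 times the squared change in <W, X>. Jumps in a discrete space are finite rather than infinitesimal, so the gap does not shrink with the step size as it would in a continuous space. The loop also shows why the exact ratio is costly when computed naively: it needs one evaluation of f per candidate jump, whereas the first-order estimate needs a single gradient.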