Denoising Diffusion Bridge Models

Diffusion models are powerful generative models that map noise to data using stochastic processes. However, for many applications such as image editing, the model input comes from a distribution that is not random noise. As such, diffusion models must rely on cumbersome methods like guidance or projected sampling to incorporate this information in the generative process. In our work, we propose Denoising Diffusion Bridge Models (DDBMs), a natural alternative to this paradigm based on diffusion bridges, a family of processes that interpolate between two paired distributions given as endpoints. Our method learns the score of the diffusion bridge from data and maps from one endpoint distribution to the other by solving a (stochastic) differential equation based on the learned score. Our method naturally unifies several classes of generative models, such as score-based diffusion models and OT-Flow-Matching, allowing us to adapt existing design and architectural choices to our more general problem. Empirically, we apply DDBMs to challenging image datasets in both pixel and latent space. On standard image translation problems, DDBMs achieve significant improvements over baseline methods, and, when we reduce the problem to image generation by setting the source distribution to random noise, DDBMs achieve FID scores comparable to those of state-of-the-art methods despite being built for a more general task.
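To make the idea of a diffusion bridge concrete, here is a minimal sketch of the simplest such process, a Brownian bridge pinned at two paired endpoints. It is an illustration under stated assumptions, not the DDBM training or sampling procedure: the function name `brownian_bridge_sample`, the horizon `T`, and the noise scale `sigma` are hypothetical choices made for this example, and no score network is involved. It only uses the standard closed form of the Brownian bridge marginal, whose mean interpolates linearly between the endpoints and whose variance is sigma^2 * t * (T - t) / T.

```python
import numpy as np


def brownian_bridge_sample(x0, xT, t, T=1.0, sigma=1.0, rng=None):
    """Draw x_t from a Brownian bridge pinned at x0 (time 0) and xT (time T).

    Hypothetical helper for illustration only; it is not taken from the
    DDBM paper or any library.
    """
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=float)
    xT = np.asarray(xT, dtype=float)

    frac = t / T
    # The mean moves linearly from x0 to xT as t goes from 0 to T.
    mean = (1.0 - frac) * x0 + frac * xT
    # The variance vanishes at both endpoints and peaks at t = T / 2.
    std = sigma * np.sqrt(t * (T - t) / T)
    return mean + std * rng.standard_normal(x0.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = np.zeros(4)  # stand-in for a sample from one endpoint distribution
    target = np.ones(4)   # stand-in for its paired sample from the other
    for t in (0.0, 0.5, 1.0):
        print(t, brownian_bridge_sample(source, target, t, rng=rng))
```

At t = 0 and t = T the sample coincides exactly with the corresponding endpoint, which is the property that lets a bridge connect two data distributions rather than data and noise. A learned model along the lines the abstract describes would replace this closed-form sampling with a differential equation driven by an estimated score.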
