Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision

Denoising diffusion models are a powerful type of generative models used to capture complex distributions of real-world signals. However, their applicability is limited to scenarios where training samples are readily available, which is not always the case in real-world applications. For example, in inverse graphics, the goal is to generate samples from a distribution of 3D scenes that align with a given image, but ground-truth 3D scenes are unavailable and only 2D images are accessible. To address this limitation, we propose a novel class of denoising diffusion probabilistic models that learn to sample from distributions of signals that are never directly observed. Instead, these signals are measured indirectly through a known differentiable forward model, which produces partial observations of the unknown signal. Our approach involves integrating the forward model directly into the denoising process. This integration effectively connects the generative modeling of observations with the generative modeling of the underlying signals, allowing for end-to-end training of a conditional generative model over signals. During inference, our approach enables sampling from the distribution of underlying signals that are consistent with a given partial observation. We demonstrate the effectiveness of our method on three challenging computer vision tasks. For instance, in the context of inverse graphics, our model enables direct sampling from the distribution of 3D scenes that align with a single 2D input image.

翻译：去噪扩散模型是一类强大的生成模型，用于捕捉真实世界信号的复杂分布。然而，其适用性仅限于训练样本可直接获取的场景，而这在实际应用中并不总是成立。例如，在逆图形学中，目标是生成与给定图像一致的3D场景分布样本，但真实3D场景不可获取，仅能获得2D图像。为解决这一局限，我们提出了一类新型去噪扩散概率模型，该模型学习从从未被直接观测到的信号分布中采样。这些信号通过已知的可微前向模型间接测量，该模型生成未知信号的部分观测结果。我们的方法将前向模型直接整合到去噪过程中。这种整合有效地将观测值的生成建模与底层信号的生成建模联系起来，从而实现对信号的条件生成模型的端到端训练。在推理阶段，我们的方法能够从与给定部分观测一致的底层信号分布中采样。我们在三个具有挑战性的计算机视觉任务上展示了该方法的有效性。例如，在逆图形学背景下，我们的模型能够直接从与单个2D输入图像一致的3D场景分布中采样。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日