GOEnFusion: Gradient Origin Encodings for 3D Forward Diffusion Models

The recently introduced Forward-Diffusion method makes it possible to train a 3D diffusion model using only 2D images for supervision. However, it does not easily generalise to different 3D representations, and it requires a computationally expensive auto-regressive sampling process to generate the underlying 3D scenes. In this paper, we propose GOEn: Gradient Origin Encoding (pronounced "gone"). GOEn can encode input images into any type of 3D representation without relying on a pre-trained image feature extractor. By design, it handles single, multiple or no source views alike, and it aims to maximise the information transferred from the views to the encodings. Our proposed GOEnFusion model pairs GOEn encodings with a realisation of the Forward-Diffusion model that addresses the limitations of the vanilla Forward-Diffusion realisation. Through the lens of a partial AutoEncoder, we evaluate how much information the GOEn mechanism transfers to the encoded representations and how well it captures the prior distribution over the underlying 3D scenes. Lastly, we evaluate the efficacy of the GOEnFusion model on the recently proposed OmniObject3D dataset, comparing it with state-of-the-art Forward-Diffusion and non-Forward-Diffusion models as well as other 3D generative models.
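The abstract names the encoding mechanism but does not spell it out. One plausible reading, stated here as an assumption rather than as the authors' method, is that the encoding is the negative gradient of a reconstruction loss taken at an all-zero ("origin") representation. The toy sketch below illustrates that reading with a hypothetical linear "renderer" in place of a real differentiable 3D renderer; the names `render` and `gradient_origin_encode` are illustrative only.

```python
# Toy illustration only. A hypothetical linear "renderer" stands in for a real
# differentiable 3D renderer; none of these names come from the paper.

def render(rep, view):
    """Render one view: each pixel is a weighted sum of representation entries."""
    return [sum(w * r for w, r in zip(row, rep)) for row in view]


def gradient_origin_encode(views, images, rep_size):
    """Encode images as the negative gradient of the reconstruction loss
    0.5 * sum((render(origin) - image) ** 2), evaluated at an all-zero origin.

    `views` is a list of per-view weight matrices (one row per pixel) and
    `images` is the matching list of pixel lists. Both are assumptions made
    for this sketch.
    """
    origin = [0.0] * rep_size
    encoding = [0.0] * rep_size
    for view, image in zip(views, images):
        rendered = render(origin, view)
        for row, pred, target in zip(view, rendered, image):
            residual = pred - target  # d(loss) / d(pred)
            for j, w in enumerate(row):
                encoding[j] -= w * residual  # accumulate -d(loss) / d(rep[j])
    return encoding


# One source view: two pixels observing a two-entry representation.
view = [[1.0, 0.0], [0.0, 2.0]]
image = [3.0, 4.0]
single = gradient_origin_encode([view], [image], rep_size=2)

# No source views: the encoding stays at the origin.
empty = gradient_origin_encode([], [], rep_size=2)
```

In this toy setting the same function covers zero, one, or several views because the gradient is simply summed over whatever views are supplied, which is one way the "single, multiple or no source view(s)" property could arise.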

