Mixed Diffusion for 3D Indoor Scene Synthesis

Realistic conditional 3D scene synthesis significantly enhances and accelerates the creation of virtual environments, and can also provide extensive training data for computer vision and robotics research, among other applications. Diffusion models have shown strong performance in related applications, e.g., making precise arrangements of unordered sets. However, these models have not been fully explored in floor-conditioned scene synthesis problems. We present MiDiffusion, a novel mixed discrete-continuous diffusion model architecture designed to synthesize plausible 3D indoor scenes from given room types, floor plans, and potentially pre-existing objects. We represent a scene layout by a 2D floor plan and a set of objects, each defined by its category, location, size, and orientation. Our approach uniquely implements structured corruption across the mixed discrete semantic and continuous geometric domains, resulting in a better-conditioned problem for the reverse denoising step. We evaluate our approach on the 3D-FRONT dataset. Our experimental results demonstrate that MiDiffusion substantially outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis. In addition, our models can handle partial object constraints via a corruption-and-masking strategy without task-specific training. We show that MiDiffusion maintains clear advantages over existing approaches in scene completion and furniture arrangement experiments.
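To make the idea of corrupting a scene jointly in the discrete and continuous domains more concrete, the following is a minimal sketch of one forward corruption step. It is an illustration under stated assumptions, not the authors' implementation: the function name `corrupt_scene`, the linear noise schedule, the use of an absorbing mask label for object categories, and the 7-column geometry layout (location, size, orientation) are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def corrupt_scene(categories, geometry, t, num_steps, mask_id, rng):
    """Corrupt one scene at diffusion step t (0 = clean, num_steps = fully corrupted).

    categories: (N,) integer array of object category labels (discrete domain).
    geometry:   (N, D) float array of location/size/orientation (continuous domain).
    mask_id:    hypothetical extra label used as an absorbing "masked" state.
    """
    # Assumed linear schedule: fraction of the original signal kept at step t.
    keep = 1.0 - t / num_steps

    # Continuous domain: Gaussian corruption, x_t = sqrt(keep) * x_0 + sqrt(1 - keep) * eps.
    noise = rng.standard_normal(geometry.shape)
    noisy_geometry = np.sqrt(keep) * geometry + np.sqrt(1.0 - keep) * noise

    # Discrete domain: each label is independently replaced by the mask label
    # with probability (1 - keep).
    masked = rng.random(categories.shape) >= keep
    noisy_categories = np.where(masked, mask_id, categories)

    return noisy_categories, noisy_geometry


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    categories = np.array([0, 3, 5])           # three objects, illustrative labels
    geometry = rng.uniform(-1, 1, size=(3, 7))  # assumed layout: 3 location, 3 size, 1 angle
    noisy_c, noisy_g = corrupt_scene(categories, geometry, t=5, num_steps=10,
                                     mask_id=9, rng=rng)
    print(noisy_c)
    print(noisy_g.shape)
```

In this sketch both domains share one schedule, so at the final step every category is masked and the geometry is pure Gaussian noise; a denoising network conditioned on the floor plan would then be trained to reverse these steps.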

