We present DiffuScene, a method for indoor 3D scene synthesis based on a novel denoising diffusion model over scene configurations. It generates 3D instance properties stored as an unordered object set and retrieves the most similar geometry for each object configuration. Each configuration is a concatenation of attributes: location, size, orientation, semantics, and geometry features. We introduce a diffusion network that synthesizes a collection of 3D indoor objects by denoising this set of unordered object attributes. The unordered parametrization simplifies approximation of the joint distribution, and diffusing the shape features facilitates natural object placements, including symmetries. Our method enables many downstream applications, including scene completion, scene arrangement, and text-conditioned scene synthesis. Experiments on the 3D-FRONT dataset show that our method synthesizes more physically plausible and diverse indoor scenes than state-of-the-art methods. Extensive ablation studies verify the effectiveness of our design choices in the scene diffusion model.
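As a rough illustration of the parametrization described above, the following minimal sketch builds each object as a concatenation of location, size, orientation, semantics, and a geometry feature, then perturbs the whole set with the standard DDPM forward step. The dimensions (`NUM_CLASSES`, `SHAPE_DIM`), the cosine/sine orientation encoding, the one-hot class encoding, and the helper names `object_vector` and `add_noise` are assumptions made for this example, not details taken from the method itself.

```python
import math
import random

NUM_CLASSES = 4   # assumed number of semantic categories (illustrative)
SHAPE_DIM = 8     # assumed length of the geometry feature (illustrative)


def object_vector(location, size, angle, class_id, shape_feature):
    """Concatenate one object's attributes into a single flat vector."""
    # Assumed encodings: one-hot semantics, (cos, sin) for the orientation.
    one_hot = [1.0 if i == class_id else 0.0 for i in range(NUM_CLASSES)]
    orientation = [math.cos(angle), math.sin(angle)]
    return list(location) + list(size) + orientation + one_hot + list(shape_feature)


def add_noise(scene, alpha_bar, rng):
    """Standard DDPM forward step, applied independently to every object:
    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, 1)."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [[a * v + b * rng.gauss(0.0, 1.0) for v in obj] for obj in scene]


rng = random.Random(0)

# A scene is a collection of object vectors; their order carries no meaning.
scene = [
    object_vector((0.0, 0.0, 1.0), (2.0, 0.5, 1.6), 0.0, 0, [0.0] * SHAPE_DIM),
    object_vector((1.5, 0.0, 1.0), (0.5, 0.5, 0.5), math.pi / 2, 1, [0.0] * SHAPE_DIM),
]

# Halfway-noised version of the scene; a denoising network would be trained
# to recover the clean attributes (or the added noise) from inputs like this.
noisy = add_noise(scene, alpha_bar=0.5, rng=rng)
```

Because the noise is applied per object and the scene is treated as a set, permuting the objects before or after noising gives the same distribution over noisy scenes, which is the property an unordered parametrization relies on.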