Recent breakthroughs in text-guided image generation have significantly advanced the field of 3D generation. While generating a single high-quality 3D object is now feasible, generating multiple objects with reasonable interactions within a 3D space, a.k.a. compositional 3D generation, presents substantial challenges. This paper introduces CompGS, a novel generative framework that employs 3D Gaussian Splatting (GS) for efficient, compositional text-to-3D content generation. To achieve this goal, two core designs are proposed: (1) 3D Gaussian initialization with 2D compositionality: We transfer the well-established compositionality of 2D models to initialize the Gaussian parameters on an entity-by-entity basis, ensuring both a consistent 3D prior for each entity and reasonable interactions among multiple entities; (2) Dynamic optimization: We propose a dynamic strategy to optimize the 3D Gaussians using the Score Distillation Sampling (SDS) loss. CompGS first automatically decomposes the 3D Gaussians into distinct entity parts, enabling optimization at both the entity and the composition level. Additionally, CompGS optimizes across objects of varying scales by dynamically adjusting the spatial parameters of each entity, enhancing the generation of fine-grained details, particularly in smaller entities. Qualitative comparisons and quantitative evaluations on T3Bench demonstrate the effectiveness of CompGS in generating compositional 3D objects with superior image quality and semantic alignment over existing methods. CompGS can also be easily extended to controllable 3D editing, facilitating scene generation. We hope CompGS will provide new insights into compositional 3D generation. Project page: https://chongjiange.github.io/compgs.html.
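The Score Distillation Sampling loss mentioned in the abstract is not spelled out there; the following toy sketch illustrates the general SDS update that such frameworks build on, not CompGS's actual implementation. The renderer, diffusion prior, and images are replaced by stand-ins (`eps_pred_stub` is a hypothetical oracle that pulls the noised render toward a fixed target, and the "render" is a flat list of pixel values); a real system would differentiate through a Gaussian Splatting renderer and query a pretrained text-conditioned diffusion model instead.

```python
import math
import random

def eps_pred_stub(x_t, t, target, alpha_t):
    # Stand-in for a pretrained diffusion prior's noise predictor (assumption):
    # it returns the noise that would explain x_t if the clean image were `target`.
    return [(xt - math.sqrt(alpha_t) * tg) / math.sqrt(1.0 - alpha_t)
            for xt, tg in zip(x_t, target)]

def sds_grad(render, target, t, alphas, rng):
    """One SDS gradient w.r.t. the rendered pixels: w(t) * (eps_hat - eps)."""
    alpha_t = alphas[t]
    eps = [rng.gauss(0.0, 1.0) for _ in render]               # sample noise
    x_t = [math.sqrt(alpha_t) * x + math.sqrt(1.0 - alpha_t) * e
           for x, e in zip(render, eps)]                      # forward diffusion
    eps_hat = eps_pred_stub(x_t, t, target, alpha_t)          # denoiser prediction
    w = 1.0 - alpha_t                                         # common SDS weighting
    return [w * (eh - e) for eh, e in zip(eps_hat, eps)]

rng = random.Random(0)
alphas = [1.0 - (i + 1) / 101.0 for i in range(100)]          # toy noise schedule
render = [0.0] * 8                                            # current "render"
target = [1.0] * 8                                            # what the prior favors
for _ in range(200):                                          # descend on the pixels
    t = rng.randrange(100)
    g = sds_grad(render, target, t, alphas, rng)
    render = [x - 0.05 * gi for x, gi in zip(render, g)]
```

With the oracle predictor, the sampled noise cancels and the gradient points from the target toward the current render, so descent pulls the render toward what the prior favors; CompGS applies this kind of update per entity and at the composition level, rather than to raw pixels as here.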