Techniques for capturing 3D indoor scenes are widely used, but the meshes they produce leave much to be desired. In this paper, we propose "RoomDreamer", which uses natural-language prompts to synthesize a new room in a different style. Unlike existing image synthesis methods, our work addresses the challenge of simultaneously synthesizing geometry and texture that are aligned with both the input scene structure and the prompt. The key insight is that a scene should be treated as a whole, taking into account both its texture and its geometry. The proposed framework consists of two components: Geometry Guided Diffusion and Mesh Optimization. Geometry Guided Diffusion for 3D Scene ensures a consistent scene style by applying the 2D prior to the entire scene simultaneously. Mesh Optimization jointly improves the geometry and texture and eliminates artifacts in the scanned scene. To validate the proposed method, we conduct extensive experiments on real indoor scenes scanned with smartphones, and the results demonstrate its effectiveness.