Manually creating 3D environments for AR/VR applications is a complex process requiring expert knowledge of 3D modeling software. Pioneering works facilitate this process by generating room meshes conditioned on textual style descriptions. Yet, many of these automatically generated 3D meshes do not adhere to typical room layouts, which compromises their plausibility, e.g., by placing several beds in one bedroom. To address these challenges, we present ControlRoom3D, a novel method to generate high-quality room meshes. Central to our approach is a user-defined 3D semantic proxy room that outlines a rough room layout based on semantic bounding boxes and a textual description of the overall room style. Our key insight is that, when rendered to 2D, this 3D representation provides valuable geometric and semantic information to control powerful 2D models to generate 3D-consistent textures and geometry that align well with the proxy room. An extensive study including quantitative metrics and qualitative user evaluations shows that our method generates diverse and globally plausible 3D room meshes, thus empowering users to design 3D rooms effortlessly without specialized knowledge.
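To make the idea of a semantic proxy room concrete, the following is a minimal sketch of how a set of labeled bounding boxes plus a style prompt could be rendered to per-pixel semantic and distance maps. It is not the authors' implementation. The names `SemanticBox`, `ProxyRoom`, `ray_box_distance`, and `render_proxy` are hypothetical, and the sketch assumes axis-aligned boxes, a pinhole camera looking along +z, and plain ray casting.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticBox:
    """Hypothetical axis-aligned semantic bounding box (units: meters)."""
    label: str
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner


@dataclass(frozen=True)
class ProxyRoom:
    """Hypothetical proxy room: a style prompt plus a rough box layout."""
    style_prompt: str
    boxes: tuple


def ray_box_distance(origin, direction, box):
    """Slab test: distance along the ray to the box entry point, or None on a miss."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box.lo, box.hi):
        if abs(d) < 1e-12:
            # Ray is parallel to this pair of slabs: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


def render_proxy(room, cam_pos, width=9, height=7, fov_deg=90.0):
    """Return (semantic, distance) maps as nested lists indexed [row][col].

    Assumption: the camera sits at cam_pos, looks along +z with +y up.
    Pixels that hit no box get label None and distance math.inf.
    """
    half_w = math.tan(math.radians(fov_deg) / 2)
    half_h = half_w * height / width
    semantic, distance = [], []
    for row in range(height):
        sem_row, dist_row = [], []
        for col in range(width):
            # Direction through the pixel center on the image plane at z = 1.
            u = (2 * (col + 0.5) / width - 1) * half_w
            v = (1 - 2 * (row + 0.5) / height) * half_h
            norm = math.sqrt(u * u + v * v + 1.0)
            direction = (u / norm, v / norm, 1.0 / norm)
            best_label, best_t = None, math.inf
            for box in room.boxes:
                t = ray_box_distance(cam_pos, direction, box)
                if t is not None and t < best_t:
                    best_label, best_t = box.label, t  # nearest box wins
            sem_row.append(best_label)
            dist_row.append(best_t)
        semantic.append(sem_row)
        distance.append(dist_row)
    return semantic, distance


# Illustrative layout: one bed with a wardrobe partly behind it.
room = ProxyRoom(
    style_prompt="a cozy Scandinavian bedroom",
    boxes=(
        SemanticBox("bed", (-1.0, -0.5, 2.0), (1.0, 0.5, 4.0)),
        SemanticBox("wardrobe", (-0.5, -0.5, 3.0), (0.5, 1.5, 5.0)),
    ),
)
semantic, distance = render_proxy(room, cam_pos=(0.0, 0.0, 0.0))
```

In a pipeline of the kind the abstract describes, maps like `semantic` and `distance` would serve as the 2D control signals for an image model, while `style_prompt` supplies the textual style. How those signals are consumed is outside the scope of this sketch.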