Diffusion-based methods have achieved remarkable success in generating 2D media. However, achieving comparable quality for scene-level mesh texturing in 3D spatial applications, e.g., XR/VR, remains limited, primarily because of the complexity of 3D geometry and the need for immersive free-viewpoint rendering. In this paper, we propose a novel indoor scene texturing framework that delivers text-driven texture generation with enchanting details and authentic spatial coherence. The key insight is to first imagine a stylized 360° panoramic texture from the central viewpoint of the scene, and then propagate it to the remaining areas with inpainting and imitating techniques. To ensure that the textures are meaningful and aligned with the scene, we develop a novel coarse-to-fine panoramic texture generation approach with dual texture alignment, which considers both the geometry and texture cues of the captured scenes. To cope with cluttered geometry during texture propagation, we design a separated strategy, which conducts texture inpainting in confident regions and then learns an implicit imitating network to synthesize textures in occluded regions and tiny structural areas. Extensive experiments and an immersive VR application on real-world indoor scenes demonstrate the high quality of the generated textures and the engaging experience on VR headsets. Project webpage: https://ybbbbt.com/publication/dreamspace