In this paper, we present TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes. Leveraging a pretrained depth-to-image diffusion model, TEXTure applies an iterative scheme that paints a 3D model from different viewpoints. Yet, while depth-to-image models can create plausible textures from a single viewpoint, the stochastic nature of the generation process can cause many inconsistencies when texturing an entire 3D object. To tackle these problems, we dynamically define a trimap partitioning of the rendered image into three progression states, and present a novel, elaborate diffusion sampling process that uses this trimap representation to generate seamless textures from different views. We then show that one can transfer the generated texture maps to new 3D geometries without requiring explicit surface-to-surface mapping, as well as extract semantic textures from a set of images without requiring any explicit reconstruction. Finally, we show that TEXTure can be used not only to generate new textures but also to edit and refine existing textures using either a text prompt or user-provided scribbles. Through extensive evaluation, we demonstrate that our TEXTuring method excels at generating, transferring, and editing textures, further closing the gap between 2D image generation and 3D texturing.
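To make the trimap idea concrete, the sketch below partitions the pixels of one rendered view into three states. The abstract only says that there are three progression states; the labels `KEEP`, `REFINE`, and `GENERATE`, the inputs `painted`, `cos_now`, and `cos_best`, and the `margin` threshold are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical state labels for the three progression states.
KEEP, REFINE, GENERATE = 0, 1, 2


def build_trimap(painted, cos_now, cos_best, margin=0.1):
    """Partition a rendered view into three states (illustrative sketch).

    painted  : bool array, True where the surface was painted in an earlier view
    cos_now  : float array, cosine between surface normal and the current view direction
    cos_best : float array, best such cosine under which the pixel was painted so far
    margin   : assumed threshold for "the current view is clearly better"
    """
    painted = np.asarray(painted, dtype=bool)
    cos_now = np.asarray(cos_now, dtype=float)
    cos_best = np.asarray(cos_best, dtype=float)

    # Default: already painted from an equally good or better angle, so keep it.
    trimap = np.full(painted.shape, KEEP, dtype=np.uint8)
    # Painted before, but the current view faces the surface noticeably better.
    trimap[painted & (cos_now > cos_best + margin)] = REFINE
    # Never painted: this region must be generated from scratch.
    trimap[~painted] = GENERATE
    return trimap


painted = [[True, True], [False, True]]
cos_now = [[0.9, 0.5], [0.8, 0.95]]
cos_best = [[0.3, 0.6], [0.0, 0.9]]
print(build_trimap(painted, cos_now, cos_best))
```

Under these assumptions, a sampler could leave `KEEP` pixels fixed, partially re-noise `REFINE` pixels, and fully synthesize `GENERATE` pixels. How TEXTure actually uses the trimap during sampling is not specified in the abstract.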