Manually creating textures for 3D meshes is time-consuming, even for expert visual content creators. We propose a fast approach for automatically texturing an input 3D mesh based on a user-provided text prompt. Importantly, our approach disentangles lighting from surface material/reflectance in the resulting texture so that the mesh can be properly relit and rendered in any lighting environment. We introduce LightControlNet, a new text-to-image model based on the ControlNet architecture, which allows the specification of the desired lighting as a conditioning image to the model. Our text-to-texture pipeline then constructs the texture in two stages. The first stage produces a sparse set of visually consistent reference views of the mesh using LightControlNet. The second stage applies a texture optimization based on Score Distillation Sampling (SDS) that works with LightControlNet to increase the texture quality while disentangling surface material from lighting. Our algorithm is significantly faster than previous text-to-texture methods, while producing high-quality and relightable textures.
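The Score Distillation Sampling (SDS) update mentioned above can be sketched in a few lines. The following is a minimal toy illustration, not the paper's implementation: the frozen diffusion model (which in this pipeline would be LightControlNet, conditioned on the text prompt and a lighting image) is replaced by a hypothetical stand-in noise predictor that pulls samples toward a fixed `target` image, and the rendered texture is simplified to the parameter image itself. It shows only the shape of the SDS gradient, `w(t) * (eps_pred - eps)`, applied to texture parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "texture" parameters we optimize, and a stand-in for the image
# content the text-conditioned diffusion prior would favor.
texture = np.zeros((8, 8))
target = np.ones((8, 8))

def sds_grad(x, alpha=0.7, sigma=0.7, w=1.0):
    """One SDS gradient estimate: w(t) * (eps_pred - eps).

    A real pipeline noises a rendered view, queries the frozen diffusion
    model for its noise prediction, and backpropagates the residual into
    the texture. Here the predictor is a hypothetical stand-in whose
    denoising direction points toward `target`.
    """
    eps = rng.standard_normal(x.shape)          # noise added to the render
    x_t = alpha * x + sigma * eps               # noised sample
    eps_pred = (x_t - alpha * target) / sigma   # stand-in noise prediction
    return w * (eps_pred - eps)                 # SDS gradient w.r.t. x

# Gradient descent on the texture using the SDS estimate.
for _ in range(200):
    texture -= 0.1 * sds_grad(texture)
```

With this stand-in predictor the noise terms cancel exactly, so the texture converges smoothly to `target`; with a real diffusion model the gradient is stochastic, which is why the paper's pipeline first anchors the optimization with a sparse set of consistent reference views.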