Manually creating textures for 3D meshes is time-consuming, even for expert visual content creators. We propose a fast approach for automatically texturing an input 3D mesh based on a user-provided text prompt. Importantly, our approach disentangles lighting from surface material/reflectance in the resulting texture so that the mesh can be properly relit and rendered in any lighting environment. We introduce LightControlNet, a new text-to-image model based on the ControlNet architecture, which allows the specification of the desired lighting as a conditioning image to the model. Our text-to-texture pipeline then constructs the texture in two stages. The first stage produces a sparse set of visually consistent reference views of the mesh using LightControlNet. The second stage applies a texture optimization based on Score Distillation Sampling (SDS) that works with LightControlNet to increase the texture quality while disentangling surface material from lighting. Our pipeline is significantly faster than previous text-to-texture methods, while producing high-quality and relightable textures.