Texture editing is a crucial task in 3D modeling that allows users to automatically manipulate the surface materials of 3D models. However, the inherent complexity of 3D models and the ambiguity of text descriptions make this task challenging. To address this challenge, we propose ITEM3D, a \textbf{T}exture \textbf{E}diting \textbf{M}odel designed for automatic \textbf{3D} object editing according to text \textbf{I}nstructions. Leveraging diffusion models and differentiable rendering, ITEM3D uses rendered images as a bridge between text and the 3D representation, and optimizes the disentangled texture and environment map. Previous methods adopt an absolute editing direction, namely score distillation sampling (SDS), as the optimization objective, which unfortunately results in a noisy appearance and inconsistency with the text. To solve the problem caused by ambiguous text, we introduce a relative editing direction, an optimization objective defined by the noise difference between the source and target texts, to resolve the semantic ambiguity between texts and images. Additionally, we gradually adjust the direction during optimization to further address unexpected deviation in the texture domain. Qualitative and quantitative experiments show that ITEM3D outperforms state-of-the-art methods on various 3D objects. We also perform text-guided relighting to demonstrate explicit control over lighting. Our project page: https://shengqiliu1.github.io/ITEM3D.
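The contrast between the absolute and relative editing directions can be sketched in equations. This is an illustrative reading under assumed notation, not necessarily the exact formulation used by ITEM3D: $\theta$ denotes the texture and environment map being optimized, $x = g(\theta)$ the image produced by the differentiable renderer, $x_t$ its noised version at timestep $t$, $\epsilon$ the sampled noise, $\epsilon_\phi$ the noise predicted by the diffusion model, $w(t)$ a timestep weight, and $y_{\mathrm{src}}$, $y_{\mathrm{tgt}}$ the source and target texts.

\begin{align}
\nabla_\theta \mathcal{L}_{\mathrm{SDS}} &= \mathbb{E}_{t,\epsilon}\left[ w(t)\,\bigl(\epsilon_\phi(x_t; y_{\mathrm{tgt}}, t) - \epsilon\bigr)\,\frac{\partial x}{\partial \theta} \right], \\
\nabla_\theta \mathcal{L}_{\mathrm{rel}} &= \mathbb{E}_{t,\epsilon}\left[ w(t)\,\bigl(\epsilon_\phi(x_t; y_{\mathrm{tgt}}, t) - \epsilon_\phi(x_t; y_{\mathrm{src}}, t)\bigr)\,\frac{\partial x}{\partial \theta} \right].
\end{align}

Under this reading, the relative direction replaces the sampled noise $\epsilon$ with the prediction conditioned on the source text, so the update depends only on how the two prompts differ. The symbol $\mathcal{L}_{\mathrm{rel}}$ and the shared weight $w(t)$ are assumptions introduced here for illustration.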