We present InstructHumans, a novel framework for instruction-driven 3D human texture editing. Existing text-based editing methods use Score Distillation Sampling (SDS) to distill guidance from generative models. This work shows that naively using such scores is harmful to editing, as they destroy consistency with the source avatar. Instead, we propose an alternative, SDS for Editing (SDS-E), which selectively incorporates subterms of SDS across diffusion timesteps. We further enhance SDS-E with spatial smoothness regularization and gradient-based viewpoint sampling to achieve high-quality edits with sharp, high-fidelity detail. InstructHumans significantly outperforms existing 3D editing methods, producing edits that stay consistent with the initial avatar while remaining faithful to the textual instructions. Project page: https://jyzhu.top/instruct-humans
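To make the phrase "subterms of SDS" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard single-condition classifier-free guidance form of the predicted noise, under which the SDS residual (guided noise minus injected noise) splits into two additive parts. The function names, the `t_switch` threshold, and the rule for which subterm is kept at which timestep are all hypothetical placeholders; the abstract does not specify the actual selection rule.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def cfg_noise(eps_uncond: Sequence[float], eps_cond: Sequence[float],
              scale: float) -> Vector:
    """Classifier-free guidance: eps_uncond + scale * (eps_cond - eps_uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def sds_subterms(eps_uncond: Sequence[float], eps_cond: Sequence[float],
                 eps: Sequence[float], scale: float) -> Tuple[Vector, Vector]:
    """Split the SDS residual (cfg_noise - eps) into two additive subterms.

    denoise: eps_uncond - eps
    guide:   scale * (eps_cond - eps_uncond)
    Their element-wise sum equals the full SDS residual.
    """
    denoise = [u - e for u, e in zip(eps_uncond, eps)]
    guide = [scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
    return denoise, guide


def selective_residual(eps_uncond: Sequence[float], eps_cond: Sequence[float],
                       eps: Sequence[float], scale: float,
                       t: int, t_switch: int = 500) -> Vector:
    """Hypothetical timestep-dependent selection of subterms.

    Placeholder rule (an assumption, not taken from the paper): always keep
    the guidance subterm, and keep the denoising subterm only when
    t < t_switch.
    """
    denoise, guide = sds_subterms(eps_uncond, eps_cond, eps, scale)
    keep_denoise = 1.0 if t < t_switch else 0.0
    return [g + keep_denoise * d for d, g in zip(denoise, guide)]


if __name__ == "__main__":
    # Toy two-element "noise" vectors, chosen only for illustration.
    eps_u, eps_c, eps = [0.5, -1.0], [1.0, 0.0], [0.25, 0.25]
    print(selective_residual(eps_u, eps_c, eps, scale=2.0, t=800))
    print(selective_residual(eps_u, eps_c, eps, scale=2.0, t=100))
```

In a real pipeline these vectors would be noise predictions from a diffusion model, and the resulting residual would be backpropagated through the renderer to the texture parameters; the sketch only illustrates the decomposition and the idea of gating subterms by timestep.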