Score distillation sampling (SDS) has emerged as an effective framework for text-driven 3D editing tasks due to its inherent 3D consistency. However, existing SDS-based 3D editing methods suffer from long training times and produce low-quality results, primarily because they deviate from the sampling dynamics of diffusion models. In this paper, we propose DreamCatalyst, a novel framework that interprets SDS-based editing as a diffusion reverse process. Our objective function accounts for these sampling dynamics, making the optimization process of DreamCatalyst an approximation of the diffusion reverse process in editing tasks. DreamCatalyst thereby reduces training time and improves editing quality. It offers two modes: (1) a fast mode, which edits a NeRF scene in only about 25 minutes, and (2) a high-quality mode, which produces superior results in under 70 minutes. Notably, our high-quality mode outperforms current state-of-the-art NeRF editing methods in both speed and quality. See more extensive results on our project page: https://dream-catalyst.github.io.