Score distillation of 2D diffusion models has proven to be a powerful mechanism for guiding 3D optimization, enabling, for example, text-based 3D generation and single-view reconstruction. A common limitation of existing score distillation formulations, however, is that the outputs of the (mode-seeking) optimization are limited in diversity, even though the underlying diffusion model is capable of generating diverse samples. In this work, inspired by the sampling process in denoising diffusion, we propose a score formulation that guides the optimization to follow generation paths defined by random initial seeds, thus ensuring diversity. We then present an approximation that adapts this formulation to scenarios where the optimization may not precisely follow the generation paths (e.g., a 3D representation whose renderings evolve in a co-dependent manner). We showcase applications of our 'Diverse Score Distillation' (DSD) formulation across tasks such as 2D optimization, text-based 3D inference, and single-view reconstruction. We also empirically validate DSD against prior score distillation formulations and show that it significantly improves sample diversity while preserving fidelity.