Text-to-3D generation has made remarkable progress recently, particularly with methods based on Score Distillation Sampling (SDS), which leverages pre-trained 2D diffusion models. While classifier-free guidance is widely acknowledged to be crucial for successful optimization, it is usually considered an auxiliary trick rather than the most essential component. In this paper, we re-evaluate the role of classifier-free guidance in score distillation and make a surprising finding: the guidance alone is enough for effective text-to-3D generation. We name this method Classifier Score Distillation (CSD), which can be interpreted as using an implicit classification model for generation. This perspective also offers new insights into existing techniques. We validate the effectiveness of CSD across a variety of text-to-3D tasks, including shape generation, texture synthesis, and shape editing, achieving results superior to those of state-of-the-art methods. Our project page is https://xinyu-andy.github.io/Classifier-Score-Distillation
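To make the claim that "the guidance alone is enough" concrete, the following is a minimal sketch of the two per-pixel update directions. It assumes the commonly used form of SDS with classifier-free guidance; the abstract itself states no formulas. The function names, the toy vectors, and the omission of the timestep weighting and of the renderer Jacobian are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def sds_direction(eps_cond: Vector, eps_uncond: Vector,
                  noise: Vector, guidance_scale: float) -> Vector:
    """Assumed SDS direction with classifier-free guidance.

    (eps_cond - noise) is the generative-prior term;
    guidance_scale * (eps_cond - eps_uncond) is the guidance term.
    """
    return [
        (c - n) + guidance_scale * (c - u)
        for c, u, n in zip(eps_cond, eps_uncond, noise)
    ]


def csd_direction(eps_cond: Vector, eps_uncond: Vector) -> Vector:
    """Assumed CSD direction: only the guidance (implicit classifier) term."""
    return [c - u for c, u in zip(eps_cond, eps_uncond)]


# Toy stand-ins for the diffusion model's noise predictions (hypothetical values).
eps_cond = [1.0, 2.0]     # prediction conditioned on the text prompt
eps_uncond = [0.5, 1.0]   # unconditional prediction
noise = [0.25, 0.5]       # noise that was added to the rendered image

sds = sds_direction(eps_cond, eps_uncond, noise, guidance_scale=2.0)
csd = csd_direction(eps_cond, eps_uncond)
```

Under these assumptions, the SDS direction equals the prior term plus the guidance scale times the CSD direction, so with a large guidance scale the CSD term dominates the update.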