Although Score Distillation Sampling (SDS) has exhibited remarkable performance in conditional 3D content generation, a comprehensive understanding of its formulation is still lacking, which hinders the development of 3D generation. In this work, we decompose SDS into a combination of three functional components, namely mode-seeking, mode-disengaging, and variance-reducing terms, and analyze the properties of each. We show that problems such as over-smoothness and implausibility result from the intrinsic deficiency of the first two terms, and we propose a more advanced variance-reducing term than the one introduced by SDS. Based on this analysis, we propose a simple yet effective approach named Stable Score Distillation (SSD), which strategically orchestrates each term for high-quality 3D generation and can be readily incorporated into various 3D generation frameworks and 3D representations. Extensive experiments validate the efficacy of our approach, demonstrating its ability to generate high-fidelity 3D content without succumbing to issues such as over-smoothness.