Realistic 3D human generation from text prompts is a desirable yet challenging task. Existing methods optimize 3D representations such as meshes or neural fields via score distillation sampling (SDS), but they suffer from inadequate fine details or excessive training time. In this paper, we propose an efficient yet effective framework, HumanGaussian, that generates high-quality 3D humans with fine-grained geometry and realistic appearance. Our key insight is that 3D Gaussian Splatting is an efficient renderer with periodic Gaussian shrinkage or growth, and that this adaptive density control can be naturally guided by intrinsic human structures. Specifically, 1) we first propose a Structure-Aware SDS that simultaneously optimizes human appearance and geometry. A multi-modal score function from both RGB and depth space guides the Gaussian densification and pruning process. 2) Moreover, we devise an Annealed Negative Prompt Guidance by decomposing SDS into a noisier generative score and a cleaner classifier score, which effectively addresses the over-saturation issue. Floating artifacts are further eliminated based on Gaussian size in a prune-only phase to enhance generation smoothness. Extensive experiments demonstrate the superior efficiency and competitive quality of our framework, which renders vivid 3D humans under diverse scenarios. Project Page: https://alvinliu0.github.io/projects/HumanGaussian
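The decomposition of SDS into a generative score and a classifier score can be illustrated with a small numerical sketch. The abstract does not state the formula, so the code below assumes one common way of writing the SDS residual under classifier-free guidance, namely (1 + w) * eps_cond - w * eps_uncond - noise, and splits it into (eps_cond - noise) + w * (eps_cond - eps_uncond). The function names, the guidance weight, and the random vectors standing in for denoiser outputs are all hypothetical; negative prompts and the annealing schedule are not modeled here.

```python
import random


def cfg_sds_residual(eps_cond, eps_uncond, noise, weight):
    """SDS residual under classifier-free guidance (assumed weighting convention)."""
    return [(1.0 + weight) * c - weight * u - n
            for c, u, n in zip(eps_cond, eps_uncond, noise)]


def decompose_sds_residual(eps_cond, eps_uncond, noise):
    """Split the residual into the two terms named in the abstract."""
    # Generative term: contains the sampled noise directly, hence "noisier".
    generative = [c - n for c, n in zip(eps_cond, noise)]
    # Classifier term: difference of two predictions; the sampled noise does not appear.
    classifier = [c - u for c, u in zip(eps_cond, eps_uncond)]
    return generative, classifier


def recombine(generative, classifier, weight):
    """Weighted sum of the two terms; should match cfg_sds_residual."""
    return [g + weight * k for g, k in zip(generative, classifier)]


# Toy stand-ins for denoiser outputs (hypothetical data, not real model predictions).
random.seed(0)
dim = 16
eps_cond = [random.gauss(0.0, 1.0) for _ in range(dim)]
eps_uncond = [random.gauss(0.0, 1.0) for _ in range(dim)]
noise = [random.gauss(0.0, 1.0) for _ in range(dim)]
weight = 7.5  # hypothetical guidance weight

generative, classifier = decompose_sds_residual(eps_cond, eps_uncond, noise)
recombined = recombine(generative, classifier, weight)
reference = cfg_sds_residual(eps_cond, eps_uncond, noise, weight)
max_gap = max(abs(a - b) for a, b in zip(recombined, reference))
```

Keeping the two terms separate is what makes it possible to weight or condition them differently, for example by changing what the conditional prediction is contrasted against in the classifier term only.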