Realistic 3D human generation from text prompts is a desirable yet challenging task. Existing methods optimize 3D representations such as meshes or neural fields via score distillation sampling (SDS), but they suffer from inadequate fine details or excessive training time. In this paper, we propose an efficient and effective framework, HumanGaussian, that generates high-quality 3D humans with fine-grained geometry and realistic appearance. Our key insight is that 3D Gaussian Splatting is an efficient renderer that periodically shrinks or grows its set of Gaussians, and that this adaptive density control can be naturally guided by intrinsic human structures. Specifically, 1) we first propose a Structure-Aware SDS that simultaneously optimizes human appearance and geometry. A multi-modal score function from both RGB and depth space is leveraged to guide the Gaussian densification and pruning process. 2) Moreover, we devise an Annealed Negative Prompt Guidance by decomposing SDS into a noisier generative score and a cleaner classifier score, which effectively addresses the over-saturation issue. Floating artifacts are further eliminated based on Gaussian size in a prune-only phase to enhance generation smoothness. Extensive experiments demonstrate the superior efficiency and competitive quality of our framework, which renders vivid 3D humans under diverse scenarios. Project Page: https://alvinliu0.github.io/projects/HumanGaussian
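The decomposition of SDS into a generative score and a classifier score can be illustrated with a small numerical sketch. It assumes the common classifier-free guidance convention, in which the guided noise prediction is the unconditional prediction plus a scaled conditional-minus-unconditional difference. Under that assumption the SDS residual splits exactly into the two terms. The function names, array shapes, and guidance scale below are hypothetical, random arrays stand in for diffusion model outputs, and the annealing schedule and negative-prompt handling of the actual method are not reproduced here.

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance under the assumed convention."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

def split_sds_residual(eps_uncond, eps_cond, eps):
    """Split the SDS residual into its two components.

    generative: unconditional prediction minus the injected noise (the noisier term)
    classifier: conditional minus unconditional prediction (the cleaner term)
    """
    generative = eps_uncond - eps
    classifier = eps_cond - eps_uncond
    return generative, classifier

# Random stand-ins for the injected noise and the two model predictions.
rng = np.random.default_rng(0)
shape = (4, 8, 8)  # hypothetical latent shape
eps = rng.standard_normal(shape)
eps_uncond = rng.standard_normal(shape)
eps_cond = rng.standard_normal(shape)
scale = 7.5  # hypothetical guidance scale

generative, classifier = split_sds_residual(eps_uncond, eps_cond, eps)

# The full SDS residual equals generative + scale * classifier.
full = cfg_noise(eps_uncond, eps_cond, scale) - eps
recombined = generative + scale * classifier
```

Once the two terms are separated, each can be weighted or scheduled independently, which is what makes a guidance scheme that treats them differently possible.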
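As a minimal sketch of what a size-based, prune-only pass over a set of Gaussians might look like, the snippet below filters Gaussians by their largest per-axis scale. The abstract does not state which sizes are removed or what thresholds are used, so the function accepts optional lower and upper bounds. The function name, array layout, and threshold values are all assumptions made for illustration.

```python
import numpy as np

def prune_by_size(scales, min_size=None, max_size=None):
    """Return a boolean mask of Gaussians to keep.

    scales: (N, 3) array of per-axis Gaussian scales (assumed layout).
    min_size / max_size: optional bounds on the largest axis scale.
    No Gaussians are added here; this is a prune-only step.
    """
    size = scales.max(axis=1)
    keep = np.ones(len(scales), dtype=bool)
    if min_size is not None:
        keep &= size >= min_size
    if max_size is not None:
        keep &= size <= max_size
    return keep

# Three hypothetical Gaussians: tiny, medium, and large.
positions = np.array([[0.0, 0.0, 0.0], [0.1, 0.2, 0.0], [2.0, 2.0, 2.0]])
scales = np.array([[0.001, 0.002, 0.001],
                   [0.03, 0.01, 0.02],
                   [0.5, 0.4, 0.6]])

keep = prune_by_size(scales, min_size=0.005, max_size=0.1)
kept_positions = positions[keep]
kept_scales = scales[keep]
```

The same mask would be applied to every per-Gaussian attribute (position, scale, opacity, color) so that the arrays stay aligned after pruning.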