We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and a parametric human body prior. While recent methods have shown encouraging results for text-to-3D generation of common objects, creating high-quality, animatable 3D avatars remains challenging. To create high-quality 3D avatars, DreamWaltz proposes 3D-consistent, occlusion-aware Score Distillation Sampling (SDS) to optimize implicit neural representations with canonical poses. It provides view-aligned supervision via 3D-aware skeleton conditioning, which enables complex avatar generation without artifacts or multiple faces. For animation, our method learns an animatable and generalizable avatar representation that can map arbitrary poses to the canonical pose representation. Extensive evaluations demonstrate that DreamWaltz is an effective and robust approach for creating 3D avatars that can take on complex shapes and appearances as well as novel poses for animation. The proposed framework further enables the creation of complex scenes with diverse compositions, including avatar-avatar, avatar-object, and avatar-scene interactions. See https://dreamwaltz3d.github.io/ for more vivid 3D avatar and animation results.