This paper aims to introduce 3D Gaussians for efficient, expressive, and editable digital avatar generation. This task faces two major challenges: (1) the unstructured nature of 3D Gaussians makes them incompatible with current generation pipelines; (2) expressive animation of 3D Gaussians in a generative setting, which involves training across multiple subjects, remains unexplored. In this paper, we propose a novel avatar generation method named $E^3$Gen to address these challenges effectively. First, we propose a novel generative UV feature plane representation that encodes unstructured 3D Gaussians onto a structured 2D UV space defined by the SMPL-X parametric model. This representation not only preserves the representation ability of the original 3D Gaussians but also introduces a structure shared across subjects, enabling generative learning with a diffusion model. To tackle the second challenge, we propose a part-aware deformation module that achieves robust and accurate expressive full-body pose control. Extensive experiments demonstrate that our method achieves superior performance in avatar generation and enables expressive full-body pose control and editing.
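As a rough illustration of the core idea (a minimal sketch, not the paper's actual implementation), unstructured per-Gaussian attributes can be scattered into a structured 2D feature plane indexed by SMPL-X UV coordinates, and queried back at arbitrary UV positions; all array shapes, resolutions, and function names below are hypothetical:

```python
import numpy as np

def gaussians_to_uv_plane(uv, feats, res=64):
    """Rasterize unstructured per-Gaussian features (N, C) into a
    structured (res, res, C) UV feature plane via nearest-texel assignment.
    uv: (N, 2) coordinates in [0, 1), e.g. from the SMPL-X UV parameterization."""
    n, c = feats.shape
    plane = np.zeros((res, res, c))
    count = np.zeros((res, res, 1))
    ij = np.clip((uv * res).astype(int), 0, res - 1)
    for (i, j), f in zip(ij, feats):
        plane[j, i] += f       # accumulate features landing on the same texel
        count[j, i] += 1.0
    return plane / np.maximum(count, 1.0)  # average colliding Gaussians

def uv_plane_to_gaussians(plane, uv):
    """Query the structured plane back at arbitrary UV positions (nearest texel)."""
    res = plane.shape[0]
    ij = np.clip((uv * res).astype(int), 0, res - 1)
    return plane[ij[:, 1], ij[:, 0]]

rng = np.random.default_rng(0)
uv = rng.random((500, 2))               # stand-in for SMPL-X UV coordinates
feats = rng.standard_normal((500, 8))   # stand-in for Gaussian attributes
plane = gaussians_to_uv_plane(uv, feats)
recovered = uv_plane_to_gaussians(plane, uv)
print(plane.shape)  # (64, 64, 8)
```

The point of the round trip is that the plane has a fixed shape shared by every subject, which is what makes it consumable by a standard 2D diffusion backbone, while the per-Gaussian attributes remain recoverable from it.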