Recovering photorealistic and drivable full-body avatars is crucial for numerous applications, including virtual reality, 3D games, and telepresence. Most methods, whether based on reconstruction or generation, require large numbers of human motion sequences and corresponding textured meshes. To learn a drivable avatar easily, a suitable parametric body model with a unified topology is essential. However, existing human body datasets contain either images or textured models, and lack parametric models that fit clothing well. We propose a new parametric model, SMPLX-Lite-D, which can fit the detailed geometry of a scanned mesh while maintaining stable geometry in the face, hand, and foot regions. We present the SMPLX-Lite dataset, the most comprehensive clothed avatar dataset, with multi-view RGB sequences, keypoint annotations, textured scanned meshes, and textured SMPLX-Lite-D models. With the SMPLX-Lite dataset, we train a conditional variational autoencoder that takes human pose and facial keypoints as input and generates a photorealistic, drivable human avatar.