Existing human datasets for avatar creation are typically limited to laboratory environments, where high-quality annotations (e.g., SMPL estimates from 3D scans or multi-view images) can be obtained under ideal conditions. However, these annotation requirements are impractical for real-world images and videos, which makes it difficult to apply current avatar creation methods in real-world settings. To this end, we propose WildAvatar, a web-scale in-the-wild human avatar creation dataset extracted from YouTube, with $10,000+$ different human subjects and scenes. WildAvatar is at least $10\times$ richer than previous datasets for 3D human avatar creation. We evaluate several state-of-the-art avatar creation methods on our dataset, highlighting the unexplored challenges of avatar creation in real-world applications. We also demonstrate the potential generalizability of avatar creation methods when they are provided with data at scale. We publicly release our data source links and annotations to push forward 3D human avatar creation and related fields for real-world applications.