Existing human datasets for avatar creation are typically limited to laboratory environments, where high-quality annotations (e.g., SMPL estimates from 3D scans or multi-view images) are readily available. However, these annotation requirements are impractical for real-world images or videos, which makes it difficult to apply current avatar creation methods in real-world settings. To this end, we propose WildAvatar, a web-scale in-the-wild human avatar creation dataset extracted from YouTube, with more than $10,000$ different human subjects and scenes. WildAvatar is at least $10\times$ richer than previous datasets for 3D human avatar creation. We evaluate several state-of-the-art avatar creation methods on our dataset, highlighting the unexplored challenges of real-world avatar creation. We also demonstrate the potential for generalizability of avatar creation methods when they are provided with data at scale. We will publicly release our data source links and annotations to advance 3D human avatar creation and related fields for real-world applications.