WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation

Existing human datasets for avatar creation are typically limited to laboratory environments, where high-quality annotations (e.g., SMPL estimates from 3D scans or multi-view images) can be provided under ideal conditions. However, these annotation requirements are impractical for real-world images or videos, which makes it difficult to apply current avatar creation methods in real-world settings. To this end, we propose WildAvatar, a web-scale in-the-wild human avatar creation dataset extracted from YouTube, with $10,000+$ different human subjects and scenes. WildAvatar is at least $10\times$ richer than previous datasets for 3D human avatar creation. We evaluate several state-of-the-art avatar creation methods on our dataset, highlighting the unexplored challenges of applying avatar creation in the real world. We also demonstrate the potential of avatar creation methods to generalize when provided with data at scale. We publicly release our data source links and annotations to push forward 3D human avatar creation and related fields for real-world applications.

