PoseGen: Learning to Generate 3D Human Pose Dataset with NeRF

This paper proposes an end-to-end framework for generating 3D human pose datasets using Neural Radiance Fields (NeRF). Public datasets generally have limited diversity in human poses and camera viewpoints, largely because collecting 3D human pose data is resource-intensive. As a result, pose estimators trained on public datasets significantly underperform on unseen out-of-distribution (OOD) samples. Previous works augment public datasets either by generating 2D-3D pose pairs or by rendering large amounts of random data. The former overlooks image rendering, and the latter yields datasets that are suboptimal for a given pre-trained model. Here we propose PoseGen, which learns to generate a dataset (3D human poses and images) using a feedback loss from a given pre-trained pose estimator. In contrast to prior art, our generated data is optimized to improve the robustness of the pre-trained model. The objective of PoseGen is to learn a data distribution that maximizes the prediction error of the given pre-trained model. Because the learned distribution contains samples that are OOD for the pre-trained model, fine-tuning the model on data sampled from it improves generalizability. This is the first work that proposes NeRFs for 3D human data generation. NeRFs are data-driven and do not require 3D scans of humans, so using NeRF for data generation opens a new direction for convenient user-specific data generation. Our extensive experiments show that PoseGen improves two baseline models (SPIN and HybrIK) on four datasets, with an average relative improvement of 6%.
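
The stated objective, learning a data distribution that maximizes the error of a frozen pre-trained model, can be illustrated with a toy sketch. This is not the PoseGen implementation: there is no NeRF, rendering, or pose estimator here. The function `estimator_error`, the 1D "view angle" variable, the Gaussian sampling distribution, and the score-function (REINFORCE-style) gradient update are all assumptions chosen only to make the idea concrete.

```python
import math
import random


def estimator_error(view_angle):
    """Hypothetical stand-in for a frozen estimator's prediction error.

    Assumption: error is low near view_angle = 0 (well covered by the
    training data) and grows toward 1 for unfamiliar viewpoints.
    """
    return 1.0 - math.exp(-view_angle ** 2)


def learn_sampling_mean(mu=0.3, sigma=0.5, lr=0.2, steps=200, batch=256, seed=0):
    """Shift the mean of a Gaussian sampler toward high-error samples.

    Uses the score-function gradient estimator
        d/dmu E[err(x)] = E[(err(x) - b) * (x - mu) / sigma^2],
    where b is the batch-mean error used as a variance-reducing baseline.
    The estimator itself is never updated here; only the data
    distribution parameter mu is learned.
    """
    rng = random.Random(seed)
    for _ in range(steps):
        samples = [rng.gauss(mu, sigma) for _ in range(batch)]
        errors = [estimator_error(x) for x in samples]
        baseline = sum(errors) / batch
        grad = sum(
            (e - baseline) * (x - mu) / sigma ** 2
            for e, x in zip(errors, samples)
        ) / batch
        mu += lr * grad  # gradient ascent: maximize expected error
    return mu


if __name__ == "__main__":
    learned_mu = learn_sampling_mean()
    print("initial mean error:", estimator_error(0.3))
    print("learned mean error:", estimator_error(learned_mu))
```

In this toy setting the sampler drifts away from the region where the stand-in estimator is accurate. Samples drawn from the learned distribution would then serve as the hard, out-of-distribution examples used for fine-tuning, which is the role the abstract assigns to the generated data.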
