PHUMA: Physically-Grounded Humanoid Locomotion Dataset

Motion imitation is a promising approach for humanoid locomotion, enabling agents to acquire human-like behaviors. Existing methods typically rely on high-quality motion capture datasets such as AMASS, but these are scarce and expensive, which limits scalability and diversity. Recent studies, exemplified by Humanoid-X, attempt to scale data collection by converting large-scale internet videos. However, the converted motions often contain physical artifacts such as floating, penetration, and foot skating, which hinder stable imitation. In response, we introduce PHUMA, a Physically-grounded HUMAnoid locomotion dataset that leverages human video at scale while addressing physical artifacts through careful data curation and physics-constrained retargeting. PHUMA enforces joint limits, ensures ground contact, and eliminates foot skating, producing motions that are both large-scale and physically reliable. We evaluate PHUMA under two conditions: (i) imitation of unseen motions from self-recorded test videos and (ii) path following with pelvis-only guidance. In both cases, policies trained on PHUMA outperform those trained on Humanoid-X and AMASS, achieving significant gains in imitating diverse motions. The code is available at https://davian-robotics.github.io/PHUMA.
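To make the three artifacts named in the abstract concrete, the following is a minimal sketch of how floating, penetration, and foot skating might be counted in a motion clip. It is not PHUMA's curation or retargeting code. The `Frame` type, the `artifact_report` function, the thresholds, and the conventions (z-up coordinates in metres, flat ground at z = 0, one position per foot) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical thresholds in metres; these are not values from PHUMA.
CONTACT_TOL = 0.02  # a foot within this distance of the ground counts as in contact
SKATE_TOL = 0.01    # allowed horizontal slide per frame for a foot in contact


@dataclass
class Frame:
    """One pose sample: foot positions as (x, y, z), z up, ground at z = 0."""
    left_foot: tuple
    right_foot: tuple


def artifact_report(frames):
    """Count frames showing each artifact under the assumptions above."""
    floating = penetration = skating = 0
    for i, frame in enumerate(frames):
        feet = (frame.left_foot, frame.right_foot)
        lowest = min(foot[2] for foot in feet)

        # Floating: neither foot is near the ground. A real pipeline would
        # have to separate this from legitimate flight phases (running, jumps).
        if lowest > CONTACT_TOL:
            floating += 1

        # Penetration: at least one foot is below the ground plane.
        if lowest < -CONTACT_TOL:
            penetration += 1

        if i == 0:
            continue  # skating needs a previous frame to compare against

        prev = frames[i - 1]
        prev_feet = (prev.left_foot, prev.right_foot)
        for before, after in zip(prev_feet, feet):
            # Skating: a foot stays in contact across two frames yet slides.
            in_contact = (abs(before[2]) <= CONTACT_TOL
                          and abs(after[2]) <= CONTACT_TOL)
            slide = hypot(after[0] - before[0], after[1] - before[1])
            if in_contact and slide > SKATE_TOL:
                skating += 1
                break  # count each frame at most once

    return {"floating": floating, "penetration": penetration, "skating": skating}
```

Counts like these could serve as a rough filter when screening video-derived motions. Actually removing the artifacts, as the abstract describes, requires constraints at retargeting time rather than after-the-fact detection.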

