Humanoid control is an important research challenge, offering avenues for integration into human-centric infrastructures and enabling physics-driven humanoid animations. The daunting challenges in this field stem from the difficulty of optimizing in high-dimensional action spaces and the instability introduced by the bipedal morphology of humanoids. However, the extensive collection of human motion-capture data and the derived datasets of humanoid trajectories, such as MoCapAct, pave the way to tackle these challenges. In this context, we present the Humanoid Generalist Autoencoding Planner (H-GAP), a state-action trajectory generative model trained on humanoid trajectories derived from human motion-capture data, capable of adeptly handling downstream control tasks with Model Predictive Control (MPC). For a humanoid with 56 degrees of freedom, we empirically demonstrate that H-GAP learns to represent and generate a wide range of motor behaviors. Further, without any learning from online interactions, it can flexibly transfer these behaviors to solve novel downstream control tasks via planning. Notably, H-GAP outperforms established MPC baselines that have access to the ground-truth dynamics model, and is superior or comparable to offline RL methods trained for individual tasks. Finally, we conduct a series of empirical studies on the scaling properties of H-GAP, showing the potential for performance gains from additional data but not from additional compute. Code and videos are available at https://ycxuyingchen.github.io/hgap/.