Developing a model for multi-task humanoid control poses several challenges. Reinforcement learning and imitation learning are the dominant approaches in this domain, but there is a trade-off between them: reinforcement learning is ill-suited to training a humanoid to perform multiple behaviors because of its training time and model size, while imitation learning from kinematic data alone cannot capture the true physics of the motion. Training models to perform multiple complex tasks requires long training times due to the high degrees of freedom (DoF) and the complexity of the movements. Although training models offline would be beneficial, another issue is the size of the dataset, which usually must be quite large to encapsulate multiple movements. There are few implementations of transformer-based models that control humanoid characters and predict their motion from a large dataset of recorded/reference motions. In this paper, we pre-train a GPT on a large dataset of noisy expert-policy rollout observations from a humanoid motion dataset, then fine-tune that model on a smaller dataset of noisy expert-policy rollout observations and actions to autoregressively generate physically plausible motion trajectories. We show that it is possible to train a GPT-based foundation model on a smaller dataset, in a shorter training time, to control a humanoid in a realistic physics environment and perform human-like movements.
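The two-stage recipe above (pre-train on observation-only rollouts, fine-tune on a smaller set of observation-action pairs, then generate autoregressively) can be sketched with a deliberately simplified linear stand-in. Everything here is a hypothetical illustration, not the paper's implementation: the synthetic dynamics `A_true`, the expert policy `W_true`, and the least-squares fits standing in for actual GPT training are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM = 4, 2

# Synthetic stand-ins for the datasets described in the abstract:
# hypothetical linear dynamics and expert policy, NOT the real
# humanoid motion data or expert-policy rollouts.
A_true = rng.normal(size=(OBS_DIM, OBS_DIM)) * 0.3  # unknown dynamics
W_true = rng.normal(size=(OBS_DIM, ACT_DIM)) * 0.3  # unknown expert policy

def make_rollout(T):
    """One noisy rollout of observations under the synthetic dynamics."""
    obs = [rng.normal(size=OBS_DIM)]
    for _ in range(T - 1):
        obs.append(obs[-1] @ A_true + 0.01 * rng.normal(size=OBS_DIM))
    return np.stack(obs)

pretrain_obs = [make_rollout(50) for _ in range(20)]  # large, obs-only
finetune_obs = [make_rollout(20) for _ in range(5)]   # small, obs + actions

# Stage 1: "pre-train" a next-observation predictor on the large,
# observation-only set (least squares stands in for GPT training).
X = np.concatenate([t[:-1] for t in pretrain_obs])
Y = np.concatenate([t[1:] for t in pretrain_obs])
A = np.linalg.lstsq(X, Y, rcond=None)[0]

# Stage 2: "fine-tune" an action head on the smaller labelled set,
# where expert actions are available alongside observations.
X2 = np.concatenate(finetune_obs)
Y2 = X2 @ W_true  # expert actions for those observations
W = np.linalg.lstsq(X2, Y2, rcond=None)[0]

# Autoregressive generation: at each step, predict an action from the
# current observation, then feed the model's own next-observation
# prediction back in as context for the following step.
def rollout(obs0, T):
    obs, out = obs0, []
    for _ in range(T):
        out.append((obs.copy(), obs @ W))  # (observation, action)
        obs = obs @ A                      # model's own next-obs prediction
    return out

traj = rollout(rng.normal(size=OBS_DIM), 10)
```

The design point the sketch mirrors is the split between a large unlabelled pre-training corpus and a small labelled fine-tuning corpus: the dynamics model is fit where data is plentiful, and only the observation-to-action mapping needs the scarce action-annotated rollouts.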