This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world. We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and in different directions. To improve performance on these challenging tasks, we develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to a short-term I/O history. To train a versatile jumping policy, we use a multi-stage training scheme with separate training stages for different objectives. After multi-stage training, the policy can be directly transferred to a real bipedal Cassie robot. Training on different tasks and exploring more diverse scenarios lead to highly robust policies that can exploit the diverse set of learned maneuvers to recover from perturbations or poor landings during real-world deployment. This robustness enables Cassie to complete a variety of challenging jumping tasks in the real world, such as standing long jumps, jumps onto elevated platforms, and multi-axis jumps.
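The policy structure described above combines two views of the same I/O stream: an encoded long-term history and the raw short-term history. The sketch below illustrates only that data flow and should not be read as the authors' implementation. The class name `DualHistoryBuffer`, the window lengths, the command vector, and the `mean_pool` encoder are all assumptions made for illustration; in a trained policy a learned network would presumably replace `mean_pool`.

```python
from collections import deque
from typing import Callable, List, Sequence

Vector = List[float]


def mean_pool(frames: Sequence[Vector]) -> Vector:
    """Placeholder encoder (assumption): average each feature over time.

    A learned encoder would take its place in an actual policy.
    """
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


class DualHistoryBuffer:
    """Hypothetical helper that keeps a long I/O window and exposes a short one."""

    def __init__(
        self,
        io_dim: int,
        long_len: int = 8,
        short_len: int = 2,
        encoder: Callable[[Sequence[Vector]], Vector] = mean_pool,
    ) -> None:
        if not 0 < short_len <= long_len:
            raise ValueError("short_len must be in the range 1..long_len")
        self.io_dim = io_dim
        self.short_len = short_len
        self.encoder = encoder
        # Zero-pad the window so the policy input has a fixed size from step 0.
        self.frames = deque(
            ([0.0] * io_dim for _ in range(long_len)), maxlen=long_len
        )

    def push(self, observation: Vector, action: Vector) -> None:
        """Append one I/O frame: the robot's output (observation) and input (action)."""
        frame = list(observation) + list(action)
        if len(frame) != self.io_dim:
            raise ValueError("observation + action must have io_dim entries")
        self.frames.append(frame)  # the oldest frame is dropped automatically

    def policy_input(self, command: Vector) -> Vector:
        """Concatenate the command, the encoded long history, and the raw short history."""
        window = list(self.frames)
        encoded_long = self.encoder(window)
        short_raw = [x for frame in window[-self.short_len:] for x in frame]
        return list(command) + encoded_long + short_raw


buffer = DualHistoryBuffer(io_dim=3, long_len=4, short_len=2)
buffer.push([1.0, 2.0], [3.0])
buffer.push([5.0, 6.0], [7.0])
features = buffer.policy_input(command=[0.5])
```

Here `features` holds the command, then the pooled summary of all four frames, then the two most recent frames unmodified. Keeping the recent frames outside the encoder means the downstream network sees the latest measurements without any compression.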