This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world. We present a multi-task reinforcement learning framework that trains the robot to accomplish a large variety of jumping tasks, such as jumping to different locations and in different directions. To improve performance on these challenging tasks, we develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to its short-term I/O history. To train a versatile multi-task policy, we use a multi-stage training scheme in which each stage targets a different objective. After multi-stage training, the multi-task policy can be directly transferred to Cassie, a physical bipedal robot. Training on different tasks and exploring more diverse scenarios lead to highly robust policies that can exploit the diverse set of learned skills to recover from perturbations or poor landings during real-world deployment. This robustness enables Cassie to complete a variety of challenging jumping tasks in the real world, such as standing long jumps, jumps onto elevated platforms, and multi-axis jumps.
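As a rough illustration of the policy structure described in the abstract, the following minimal sketch shows one way a policy could combine a compressed long-term I/O history with direct access to the short-term I/O history. Everything in it is an assumption made for illustration: the class name `DualHistoryPolicy`, the window lengths, the single linear encoder, the `command` vector standing in for the jump target, and the random weights standing in for trained parameters. It is not the architecture used in this work.

```python
import math
import random
from collections import deque


def matvec(weights, x):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class DualHistoryPolicy:
    """Toy policy: compressed long I/O history plus raw short I/O history.

    Hypothetical sketch; sizes and layers are illustrative assumptions.
    """

    def __init__(self, obs_dim, act_dim, command_dim,
                 long_len=20, short_len=4, latent_dim=8, seed=0):
        rng = random.Random(seed)
        io_dim = obs_dim + act_dim  # one I/O pair = observation + action
        self.short_len = short_len
        # Zero-padded buffer holding the last `long_len` I/O pairs.
        self.history = deque([[0.0] * io_dim for _ in range(long_len)],
                             maxlen=long_len)
        self.feature_dim = (latent_dim + short_len * io_dim
                            + obs_dim + command_dim)
        # Random weights stand in for trained parameters (assumption).
        self.w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(long_len * io_dim)]
                      for _ in range(latent_dim)]
        self.w_head = [[rng.uniform(-0.1, 0.1) for _ in range(self.feature_dim)]
                       for _ in range(act_dim)]

    def features(self, obs, command):
        steps = list(self.history)
        # Long-term branch: the whole window is compressed into a small latent.
        long_flat = [v for step in steps for v in step]
        latent = [math.tanh(v) for v in matvec(self.w_enc, long_flat)]
        # Short-term branch: the most recent pairs bypass the encoder.
        short_flat = [v for step in steps[-self.short_len:] for v in step]
        return latent + short_flat + list(obs) + list(command)

    def act(self, obs, command):
        feats = self.features(obs, command)
        action = [math.tanh(v) for v in matvec(self.w_head, feats)]
        # Store the new I/O pair so later steps can condition on it.
        self.history.append(list(obs) + action)
        return action


policy = DualHistoryPolicy(obs_dim=3, act_dim=2, command_dim=2)
action = policy.act([0.1, 0.2, 0.3], command=[1.0, 0.0])
```

In this sketch the short-term pairs skip the encoder, so the output layer sees recent feedback unmodified, while the latent summarizes the longer window at a fixed size.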