Parkour is a grand challenge for legged locomotion that requires robots to overcome various obstacles rapidly in complex environments. Existing methods can generate either diverse but blind locomotion skills or vision-based but specialized skills by using reference animal data or complex rewards. However, autonomous parkour requires robots to learn generalizable skills that are both vision-based and diverse to perceive and react to various scenarios. In this work, we propose a system for learning a single end-to-end vision-based parkour policy that covers diverse parkour skills, using a simple reward and no reference motion data. We develop a reinforcement learning method inspired by direct collocation to generate parkour skills, including climbing over high obstacles, leaping over large gaps, crawling beneath low barriers, squeezing through thin slits, and running. We distill these skills into a single vision-based parkour policy and transfer it to a quadrupedal robot using its egocentric depth camera. We demonstrate that our system can empower two different low-cost robots to autonomously select and execute appropriate parkour skills to traverse challenging real-world environments.