Reinforcement learning (RL) for bipedal locomotion has recently demonstrated robust gaits over moderate terrains using only proprioceptive sensing. However, such blind controllers fail in environments where the robot must anticipate and adapt to local terrain, a capability that requires visual perception. In this paper, we propose a fully learned system that allows bipedal robots to react to local terrain while maintaining the commanded travel speed and direction. Our approach first trains a controller in simulation using a heightmap expressed in the robot's local frame. Next, data is collected in simulation to train a heightmap predictor whose input is a history of depth images and robot states. We demonstrate that, with appropriate domain randomization, this approach allows for successful sim-to-real transfer with no explicit pose estimation and no fine-tuning on real-world data. To the best of our knowledge, this is the first example of sim-to-real learning for vision-based bipedal locomotion over challenging terrains.
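To make the phrase "a heightmap expressed in the robot's local frame" concrete, the following is a minimal sketch of how such a heightmap could be sampled from ground-truth terrain in simulation. It is an illustration under stated assumptions, not the system described above: the function `local_heightmap`, the `terrain_height` callable, and the grid dimensions and resolution are all hypothetical choices.

```python
import math


def local_heightmap(terrain_height, x, y, yaw,
                    length=1.0, width=0.6, resolution=0.1):
    """Sample terrain heights on a grid fixed to the robot's local frame.

    terrain_height: hypothetical callable (world_x, world_y) -> height [m].
    x, y, yaw: robot base position [m] and heading [rad] in the world frame.
    length, width, resolution: assumed grid extent and cell size [m].
    Returns a list of rows; row i lies i * resolution ahead of the robot,
    and columns run from the robot's right (-width/2) to left (+width/2).
    """
    n_forward = int(round(length / resolution)) + 1
    n_lateral = int(round(width / resolution)) + 1
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)

    heightmap = []
    for i in range(n_forward):
        dx = i * resolution  # forward offset in the local frame
        row = []
        for j in range(n_lateral):
            dy = -width / 2 + j * resolution  # lateral offset in the local frame
            # Rotate the local offset by the heading and translate to world coordinates.
            world_x = x + cos_yaw * dx - sin_yaw * dy
            world_y = y + sin_yaw * dx + cos_yaw * dy
            row.append(terrain_height(world_x, world_y))
        heightmap.append(row)
    return heightmap


def step_terrain(world_x, world_y):
    """Toy terrain: a 0.2 m step up for world_x >= 1.05 m, flat elsewhere."""
    return 0.2 if world_x >= 1.05 else 0.0


# Robot facing the step sees it in the far rows of the grid.
facing_step = local_heightmap(step_terrain, x=0.5, y=0.0, yaw=0.0)
# Robot turned 90 degrees walks parallel to the step and sees flat ground.
parallel_to_step = local_heightmap(step_terrain, x=0.5, y=0.0, yaw=math.pi / 2)
```

Because the grid rotates and translates with the robot, the same terrain produces different heightmaps depending on heading, which is what lets a controller consume the map without a separate global pose input.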