Achieving agile and generalized legged locomotion across terrains requires tight integration of perception and control, especially under occlusions and sparse footholds. Existing methods have demonstrated agility on parkour courses but often rely on end-to-end sensorimotor models with limited generalization and interpretability. By contrast, methods targeting generalized locomotion typically exhibit limited agility and struggle with visual occlusions. We introduce AME-2, a unified reinforcement learning (RL) framework for agile and generalized locomotion that incorporates a novel attention-based map encoder in the control policy. This encoder extracts local and global mapping features and uses attention mechanisms to focus on salient regions, producing an interpretable and generalized embedding for RL-based control. We further propose a learning-based mapping pipeline that provides fast, uncertainty-aware terrain representations robust to noise and occlusions, serving as policy inputs. It uses neural networks to convert depth observations into local elevations with uncertainties, and fuses them with odometry. The pipeline also integrates with parallel simulation so that we can train controllers with online mapping, aiding sim-to-real transfer. We validate AME-2 with the proposed mapping pipeline on a quadruped and a biped robot, and the resulting controllers demonstrate strong agility and generalization to unseen terrains in simulation and in real-world experiments.
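As a rough illustration of what fusing local elevations with uncertainties could look like, the sketch below merges a new per-cell elevation estimate into an existing map by inverse-variance weighting. The abstract does not specify the fusion rule, so the weighting scheme, the function names `fuse_cell` and `fuse_maps`, and the convention of marking unobserved cells with infinite variance are all assumptions made for illustration. Odometry-based alignment of the two grids is assumed to have been done already.

```python
import math


def fuse_cell(mean_a, var_a, mean_b, var_b):
    """Fuse two elevation estimates for one cell by inverse-variance weighting.

    Assumption: variances are strictly positive, and an unobserved cell is
    marked with infinite variance (its mean is then ignored).
    """
    if math.isinf(var_a):
        return mean_b, var_b
    if math.isinf(var_b):
        return mean_a, var_a
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused_var = 1.0 / (w_a + w_b)
    fused_mean = fused_var * (w_a * mean_a + w_b * mean_b)
    return fused_mean, fused_var


def fuse_maps(map_mean, map_var, obs_mean, obs_var):
    """Apply fuse_cell to every cell of two equally sized 2D grids.

    Assumption: both grids are already expressed in the same frame, i.e. the
    odometry-based shift has been applied before calling this function.
    """
    fused_mean, fused_var = [], []
    for row in range(len(map_mean)):
        mean_row, var_row = [], []
        for col in range(len(map_mean[row])):
            m, v = fuse_cell(map_mean[row][col], map_var[row][col],
                             obs_mean[row][col], obs_var[row][col])
            mean_row.append(m)
            var_row.append(v)
        fused_mean.append(mean_row)
        fused_var.append(var_row)
    return fused_mean, fused_var
```

With this rule, a cell the map has never seen simply adopts the new observation, while a cell seen twice with equal confidence ends up at the average elevation with half the variance. A learned pipeline may weight or gate observations differently.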