Achieving agile and generalized legged locomotion across terrains requires tight integration of perception and control, especially under occlusions and sparse footholds. Existing methods have demonstrated agility on parkour courses but often rely on end-to-end sensorimotor models with limited generalization and interpretability. By contrast, methods targeting generalized locomotion typically exhibit limited agility and struggle with visual occlusions. We introduce AME-2, a unified reinforcement learning (RL) framework for agile and generalized locomotion that incorporates a novel attention-based map encoder in the control policy. The encoder extracts local and global mapping features and uses attention mechanisms to focus on salient regions, producing an interpretable and generalizable embedding for RL-based control. We further propose a learning-based mapping pipeline that provides fast, uncertainty-aware terrain representations, robust to noise and occlusions, as policy inputs. It uses neural networks to convert depth observations into local elevation estimates with associated uncertainties, and fuses them with odometry. The pipeline also integrates with parallel simulation, allowing controllers to be trained with online mapping in the loop and aiding sim-to-real transfer. We validate AME-2 with the proposed mapping pipeline on a quadruped robot and a biped robot; the resulting controllers demonstrate strong agility and generalization to unseen terrains both in simulation and in real-world experiments.
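As a concrete illustration of how an attention-based map encoder can yield an interpretable embedding, the sketch below cross-attends a learned query over embedded elevation-map patches, so the returned attention weights show which map regions drive the embedding. All layer sizes, the patch tokenization, and the single-query design are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AttentionMapEncoder(nn.Module):
    """Sketch of an attention-based encoder over elevation-map patches.

    Flattened local patches are linearly embedded, then a learned query
    cross-attends over them so the output embedding can focus on salient
    regions (e.g. gap edges, stepping stones). Hypothetical sizes.
    """

    def __init__(self, patch_dim=16, embed_dim=64, num_heads=4):
        super().__init__()
        self.patch_embed = nn.Linear(patch_dim, embed_dim)
        self.query = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.attn = nn.MultiheadAttention(embed_dim, num_heads,
                                          batch_first=True)

    def forward(self, patches):
        # patches: (batch, num_patches, patch_dim) flattened map patches
        tokens = self.patch_embed(patches)
        q = self.query.expand(patches.shape[0], -1, -1)
        # weights: (batch, 1, num_patches) -- inspectable saliency map
        emb, weights = self.attn(q, tokens, tokens)
        return emb.squeeze(1), weights  # (batch, embed_dim), weights
```

Because the attention weights form a distribution over map patches, they can be visualized directly on the elevation map, which is one common way such encoders are made interpretable.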
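The uncertainty-aware fusion of predicted local elevations into the map can be sketched as a per-cell variance-weighted (Kalman-style) update, a standard choice for elevation mapping. The function below is a minimal illustration under that assumption, not the paper's exact formulation; occluded cells are marked with NaN and keep their prior estimate.

```python
import numpy as np

def fuse_elevation(map_h, map_var, meas_h, meas_var):
    """Variance-weighted (1-D Kalman) fusion of new elevation
    measurements into an existing elevation map, cell by cell.

    map_h, map_var   : current per-cell elevation and variance (H x W)
    meas_h, meas_var : new per-cell elevation and its predicted
                       uncertainty (H x W); np.nan marks occluded cells.
    Returns the fused elevation and the (reduced) fused variance.
    """
    fused_h = map_h.copy()
    fused_var = map_var.copy()

    valid = ~np.isnan(meas_h)  # occluded cells keep the prior estimate
    k = map_var[valid] / (map_var[valid] + meas_var[valid])  # Kalman gain
    fused_h[valid] = map_h[valid] + k * (meas_h[valid] - map_h[valid])
    fused_var[valid] = (1.0 - k) * map_var[valid]
    return fused_h, fused_var
```

An update of this form pulls each cell toward the more certain of the two estimates and shrinks its variance, which is what makes the map robust to noisy depth readings and partial occlusions.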