Training reinforcement learning (RL) policies for legged locomotion often requires extensive environment interactions, which are costly and time-consuming. We propose Symmetry-Guided Memory Augmentation (SGMA), a framework that improves training efficiency by combining structured experience augmentation with memory-based context inference. Our method leverages robot and task symmetries to generate additional, physically consistent training experiences without requiring extra interactions. To avoid the pitfalls of naive augmentation, we extend these transformations to the policy's memory states, enabling the agent to retain task-relevant context and adapt its behavior accordingly. We evaluate the approach on quadruped and humanoid robots in simulation, as well as on a real quadruped platform. Across diverse locomotion tasks involving joint failures and payload variations, our method achieves efficient policy training while maintaining robust performance, demonstrating a practical route toward data-efficient RL for legged robots.
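The core idea of generating "additional, physically consistent training experiences" from robot symmetries can be illustrated with a minimal sketch. The joint ordering, the left-right mirror permutation, and the sign convention below are all assumptions chosen for illustration; they are not taken from the paper, and a real implementation would use the specific robot's kinematic layout:

```python
import numpy as np

# Hypothetical layout: 12 joints for a quadruped, ordered
# [FL, FR, RL, RR] legs x [hip_abduction, hip_flexion, knee].
# Mirroring about the sagittal plane swaps left and right legs.
MIRROR_JOINTS = np.array([3, 4, 5, 0, 1, 2, 9, 10, 11, 6, 7, 8])
# Under this (assumed) convention, hip abduction flips sign when
# mirrored; flexion and knee angles do not.
JOINT_SIGNS = np.array([-1.0, 1.0, 1.0] * 4)


def mirror_joint_vector(q):
    """Reflect a joint-space vector (positions, velocities, or torques)
    about the robot's sagittal plane."""
    return JOINT_SIGNS * q[MIRROR_JOINTS]


def mirror_transition(obs, action, next_obs, reward):
    """Produce a mirrored copy of a transition at no interaction cost.
    The reward is kept unchanged, which is valid only for tasks that are
    themselves left-right symmetric (e.g. walking straight ahead)."""
    return (mirror_joint_vector(obs),
            mirror_joint_vector(action),
            mirror_joint_vector(next_obs),
            reward)
```

For a memory-based policy the same transformation must also be applied to the recurrent hidden state (for instance, permuting hidden units if they are structured per leg), since otherwise the mirrored observation is paired with an un-mirrored context; this is the kind of inconsistency the abstract refers to as a pitfall of naive augmentation.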