Quadrupedal robots exhibit a wide range of viable gaits, but generating specific footfall sequences often requires laborious expert tuning of numerous variables, such as touch-down and lift-off events and holonomic constraints for each leg. This paper presents a unified reinforcement learning framework for generating versatile quadrupedal gaits by leveraging the intrinsic symmetries and velocity-period relationship of dynamic legged systems. We propose a symmetry-guided reward function design that incorporates temporal, morphological, and time-reversal symmetries. By focusing on preserved symmetries and natural dynamics, our approach eliminates the need for predefined trajectories, enabling smooth transitions between diverse locomotion patterns such as trotting, bounding, half-bounding, and galloping. Implemented on the Unitree Go2 robot, our method demonstrates robust performance across a range of speeds in both simulations and hardware tests, significantly improving gait adaptability without extensive reward tuning or explicit foot placement control. This work provides insights into dynamic locomotion strategies and underscores the crucial role of symmetries in robotic gait design.