A Central Pattern Generator (CPG) excels at generating rhythmic gait patterns with consistent timing and adequate foot clearance. However, its open-loop structure often degrades control performance when the environment changes. Reinforcement Learning (RL), by contrast, is model-free and has gained significant traction in robotics because of its adaptability and robustness. Training traditional RL approaches from scratch, however, is computationally expensive and carries a heightened risk of converging to suboptimal local minima. In this paper, we propose SYNLOCO, a quadruped locomotion framework that synthesizes CPG and RL to combine the strengths of both methods, yielding a locomotion controller that is both stable and natural. We also introduce a set of performance-driven reward metrics that aid the learning of locomotion control, and we present a two-phase training strategy to optimize SYNLOCO's learning trajectory. Our empirical evaluation on a Unitree GO1 robot under varied conditions, including different velocities, terrains, and payloads, shows that SYNLOCO produces consistent gaits with adequate foot clearance across diverse scenarios. The resulting controller remains robust under substantial parameter variations, underscoring its potential for real-world applications.