Spring-based actuators in legged locomotion provide energy efficiency and improved performance, but they make controller design more difficult. While previous work has relied on extensive modeling and simulation to find optimal controllers for such systems, we propose to learn model-free controllers directly on the real robot. In our approach, gaits are first synthesized by central pattern generators (CPGs), whose parameters are optimized to quickly obtain an open-loop controller that achieves efficient locomotion. Then, to make this controller more robust and further improve performance, we use reinforcement learning to close the loop by learning corrective actions on top of the CPGs. We evaluate the proposed approach on the DLR elastic quadruped bert. Our results in learning trotting and pronking gaits show that exploitation of the spring actuator dynamics emerges naturally from optimizing for dynamic motions, yielding high-performing locomotion despite the approach being model-free. The whole process takes no more than 1.5 hours on the real robot and results in natural-looking gaits.
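The two-stage structure described in the abstract (an open-loop CPG, then learned corrections added on top) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the CPG is reduced to phase-shifted sine oscillators with one target per leg, the leg order is taken to be front-left, front-right, hind-left, hind-right, and the names `GAIT_PHASES`, `cpg_targets`, `closed_loop_targets`, and `max_correction`, along with all numeric values, are hypothetical.

```python
import math

# Hypothetical per-leg phase offsets (fractions of a cycle), not from the paper.
# Assumed leg order: front-left, front-right, hind-left, hind-right.
GAIT_PHASES = {
    "trot": [0.0, 0.5, 0.5, 0.0],   # diagonal leg pairs move together
    "pronk": [0.0, 0.0, 0.0, 0.0],  # all four legs move together
}


def cpg_targets(t, gait, frequency=1.0, amplitude=0.3, offset=0.0):
    """Open-loop targets: one phase-shifted sine oscillator per leg."""
    return [
        offset + amplitude * math.sin(2.0 * math.pi * (frequency * t + phase))
        for phase in GAIT_PHASES[gait]
    ]


def closed_loop_targets(t, gait, observation, policy,
                        max_correction=0.1, **cpg_params):
    """CPG targets plus a bounded corrective action from a policy.

    `policy` is any callable mapping an observation to one correction
    per leg; in the closed-loop stage it would be the learned controller.
    """
    base = cpg_targets(t, gait, **cpg_params)
    correction = policy(observation)
    # Clip each correction so the policy can only adjust the CPG output.
    return [
        b + max(-max_correction, min(max_correction, c))
        for b, c in zip(base, correction)
    ]


def zero_policy(observation):
    """Placeholder policy: no correction, so the output equals the CPG."""
    return [0.0, 0.0, 0.0, 0.0]
```

In this sketch, the first stage would tune `frequency`, `amplitude`, and `offset` with a black-box optimizer, and the second stage would replace `zero_policy` with a policy trained by reinforcement learning. Clipping the correction is one possible design choice for keeping the learned policy close to the optimized open-loop gait.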