Quadruped animals are capable of exhibiting a diverse range of locomotion gaits. While progress has been made in demonstrating such gaits on robots, current methods rely on motion priors, dynamics models, or other forms of extensive manual effort. People can use natural language to describe dance moves. Could one use a formal language to specify quadruped gaits? To this end, we aim to enable easy gait specification and efficient policy learning. Our approach, RM-based Locomotion Learning~(RMLL), leverages Reward Machines~(RMs) for high-level gait specification over foot contacts and supports adjusting gait frequency at execution time. Gait specification requires only a few logical rules per gait (e.g., alternate between moving front feet and back feet) and does not require labor-intensive motion priors. Experimental results in simulation highlight the diversity of learned gaits (including two novel gaits), their energy consumption and stability across different terrains, and their superior sample efficiency compared to baselines. We also demonstrate these learned policies with a real quadruped robot. Video and supplementary materials: https://sites.google.com/view/rm-locomotion-learning/home
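To make the "few logical rules per gait" idea concrete, the following is a minimal sketch of a Reward Machine as a finite-state automaton over foot-contact propositions. It is illustrative only and assumes nothing beyond the abstract: the class, state names, contact labels (FL/FR for front feet, RL/RR for rear feet), and reward values are all hypothetical, not the paper's actual implementation.

```python
# Hypothetical sketch: a Reward Machine (RM) that rewards foot-contact
# sequences matching a gait's logical rules. Names and rewards are
# illustrative, not taken from the paper's code.

class RewardMachine:
    """Finite-state machine that emits reward when observed foot
    contacts match the expected step of the gait specification."""

    def __init__(self, transitions, initial_state):
        # transitions: {(state, contact_set): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, contacts):
        """Advance on the set of feet currently in contact; return reward."""
        key = (self.state, frozenset(contacts))
        if key in self.transitions:
            self.state, reward = self.transitions[key]
            return reward
        return 0.0  # contacts do not match the gait rule: no reward

# The rule from the abstract's example gait: alternate between moving
# front feet (FL, FR) and back feet (RL, RR).
FRONT, BACK = frozenset({"FL", "FR"}), frozenset({"RL", "RR"})
bound_rm = RewardMachine(
    transitions={
        ("expect_front", FRONT): ("expect_back", 1.0),
        ("expect_back", BACK): ("expect_front", 1.0),
    },
    initial_state="expect_front",
)

r = bound_rm.step({"FL", "FR"})   # front feet touch down -> reward
r += bound_rm.step({"RL", "RR"})  # back feet touch down  -> reward
```

Under this kind of specification, the RM state augments the policy's observation, and the low-level learner only needs to discover joint motions that trigger the rewarded contact transitions; changing the gait means changing only the small transition table.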