This work introduces a model-free reinforcement learning framework that enables multiple locomotion modes (quadruped, tripod, or biped) and diverse tasks for legged robots. We employ a motion-style reward based on a relaxed logarithmic barrier function as a soft constraint to bias the learning process toward the desired motion style, specified through quantities such as gait, foot clearance, joint position, or body height. The predefined gait cycle is encoded in a flexible manner, which facilitates gait adjustments throughout the learning process. Extensive experiments demonstrate that KAIST HOUND, a 45 kg robotic system, can achieve biped, tripod, and quadruped locomotion using the proposed framework. Quadrupedal capabilities include traversing uneven terrain, galloping at 4.67 m/s, and overcoming obstacles up to 58 cm (67 cm for HOUND2). Bipedal capabilities include running at 3.6 m/s, carrying a 7.5 kg object, and ascending stairs. All of these tasks were performed without exteroceptive input.
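To make the soft-constraint idea concrete, the following is a minimal Python sketch of a relaxed logarithmic barrier and a reward built from it. It assumes the commonly used form in which the barrier equals -ln(z) for z > δ and is continued by a quadratic below δ, so that it stays finite when the constraint is violated. The abstract does not give the exact formulation, scaling, or parameter values used in this work, so the function names, the default `delta`, and the body-height bounds in the usage note are all hypothetical.

```python
import math


def relaxed_log_barrier(z: float, delta: float) -> float:
    """Relaxed log barrier for the constraint z > 0 (assumed standard form).

    Equals -ln(z) when z > delta. For z <= delta it switches to a quadratic
    that matches the value and slope of -ln(z) at z = delta, so the penalty
    remains finite even when the constraint is violated.
    """
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    if z > delta:
        return -math.log(z)
    return 0.5 * (((z - 2.0 * delta) / delta) ** 2 - 1.0) - math.log(delta)


def style_reward(value: float, lower: float, upper: float,
                 delta: float = 0.1) -> float:
    """Hypothetical motion-style reward keeping `value` inside (lower, upper).

    The negated sum of two barriers, one per bound: highest at the midpoint
    of the interval and decreasing smoothly toward and beyond the bounds.
    """
    return -(relaxed_log_barrier(value - lower, delta)
             + relaxed_log_barrier(upper - value, delta))
```

As an illustration with made-up numbers, `style_reward(0.5, 0.4, 0.6, delta=0.02)` scores a body height of 0.5 m against a hypothetical 0.4 to 0.6 m band. Because the quadratic branch is finite, a policy that leaves the band still receives an informative penalty instead of an infinite one, which is what makes the constraint soft.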