Quadrupedal robots can learn versatile locomotion skills but remain vulnerable when one or more joints lose power. In contrast, dogs and cats can adopt limping gaits when injured, demonstrating a remarkable ability to adapt to their physical condition. Inspired by this adaptability, this paper presents Action Learner (AcL), a novel teacher-student reinforcement learning framework that enables quadrupeds to autonomously adapt their gait for stable walking under multiple joint faults. Unlike conventional teacher-student approaches that enforce strict imitation, AcL uses teacher policies to generate style rewards, guiding the student policy without requiring precise replication. We train multiple teacher policies, each corresponding to a different fault condition, and then distill them into a single student policy with an encoder-decoder architecture. While prior works primarily address single-joint faults, AcL enables quadrupeds to walk with up to four faulty joints across one or two legs, autonomously switching between different limping gaits when faults occur. We validate AcL on a real Go2 quadruped robot under single- and double-joint faults, demonstrating fault-tolerant, stable walking, smooth transitions between normal and lame gaits, and robustness against external disturbances.
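To make the contrast between strict imitation and style rewards concrete, the following is a minimal sketch, not the authors' implementation. It assumes the style reward is an exponential kernel on the squared distance between the student's action and the action of the teacher trained for the active fault condition. The abstract does not specify the reward's form, so the kernel, the names `style_reward`, `shaped_reward`, `step_reward`, and `TEACHERS`, the fault labels, and the `sharpness` and `style_weight` parameters are all hypothetical.

```python
import math

def style_reward(student_action, teacher_action, sharpness=2.0):
    """Soft similarity in (0, 1]: 1.0 when the actions match, decaying with squared distance.

    Assumed form (exponential kernel); the paper's actual style reward may differ.
    """
    sq_dist = sum((s - t) ** 2 for s, t in zip(student_action, teacher_action))
    return math.exp(-sharpness * sq_dist)

def shaped_reward(task_reward, student_action, teacher_action, style_weight=0.5):
    """Add the style term as a bonus to the task reward instead of using it as a hard imitation target."""
    return task_reward + style_weight * style_reward(student_action, teacher_action)

# Hypothetical stand-ins for trained teacher policies, one per fault condition.
# Each maps an observation to a (toy, 3-dimensional) action.
TEACHERS = {
    "no_fault": lambda obs: [0.0, 0.0, 0.0],
    "front_left_knee_fault": lambda obs: [0.3, -0.2, 0.0],
}

def step_reward(fault, obs, student_action, task_reward):
    """Query the teacher matching the active fault and compute the shaped reward."""
    teacher_action = TEACHERS[fault](obs)
    return shaped_reward(task_reward, student_action, teacher_action)
```

Because the style term only adds a bounded bonus, a student that deviates from the teacher still collects the task reward, which is one way to read "guiding the student policy without requiring precise replication."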