With the growing use of learning algorithms in robotic applications, reinforcement learning for bipedal locomotion has become a central research topic in humanoid robotics. While recently published contributions achieve high success rates in locomotion tasks, little attention has been devoted to methods that can handle hardware faults occurring during locomotion. In real-world settings, however, environmental disturbances or sudden hardware faults can have severe consequences. To address these issues, this paper presents TOLEBI (A faulT-tOlerant Learning framEwork for Bipedal locomotIon), which handles faults on the robot during operation. Specifically, joint locking, power loss, and external disturbances are injected in simulation to learn fault-tolerant locomotion strategies. In addition to transferring the learned policy to the real robot via sim-to-real transfer, an online joint status module is incorporated. This module classifies joint conditions from the actual observations at runtime under real-world conditions. Validation experiments conducted both in simulation and in the real world with the humanoid robot TOCABI highlight the applicability of the proposed approach. To our knowledge, this manuscript provides the first learning-based fault-tolerant framework for bipedal locomotion, thereby fostering the development of efficient learning methods in this field.
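The fault injection summarized above (joint locking, power loss, and external disturbances applied in simulation) could be sketched roughly as follows. This is an illustrative sketch only, not the TOLEBI implementation: the names `JointFault`, `FaultInjector`, `apply_faults`, and `sample_push`, the per-episode fault probability, the PD gains used to hold a locked joint, and the push magnitude are all assumptions introduced here.

```python
import random
from dataclasses import dataclass
from enum import Enum


class JointFault(Enum):
    NONE = 0
    LOCKED = 1      # joint held at the angle where the fault occurred
    POWER_LOSS = 2  # actuator produces zero torque


@dataclass
class FaultInjector:
    """Hypothetical per-episode fault sampler (not from the paper)."""
    num_joints: int
    fault_prob: float = 0.5  # assumed probability that an episode has a fault

    def sample_episode_fault(self, rng):
        """Return one fault label per joint, with at most one faulty joint."""
        faults = [JointFault.NONE] * self.num_joints
        if rng.random() < self.fault_prob:
            joint = rng.randrange(self.num_joints)
            faults[joint] = rng.choice(
                [JointFault.LOCKED, JointFault.POWER_LOSS]
            )
        return faults


def apply_faults(torques, faults, q, qd, q_lock, kp=500.0, kd=10.0):
    """Map commanded torques to the torques actually applied in simulation.

    A locked joint is approximated by a stiff PD term that holds it at
    q_lock; the gains kp and kd are assumed values for illustration.
    """
    applied = []
    for tau, fault, qi, qdi, ql in zip(torques, faults, q, qd, q_lock):
        if fault is JointFault.POWER_LOSS:
            applied.append(0.0)
        elif fault is JointFault.LOCKED:
            applied.append(kp * (ql - qi) - kd * qdi)
        else:
            applied.append(tau)
    return applied


def sample_push(rng, max_force=100.0):
    """Sample a random horizontal push (fx, fy) as an external disturbance.

    max_force is an assumed bound per axis, not a value from the paper.
    """
    return (rng.uniform(-max_force, max_force),
            rng.uniform(-max_force, max_force))
```

In a training loop, the fault labels would typically be sampled once per episode and `apply_faults` called at every control step, so that the policy experiences the same fault consistently within an episode. How the actual framework schedules faults is not stated in the abstract.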