In recent years, legged and wheeled-legged robots have gained prominence for tasks in environments built primarily for humans. A significant challenge for many of these robots is their limited ability to navigate stairs, which hampers their use in multi-story environments. This study proposes a method to address this limitation, employing reinforcement learning (RL) to develop a versatile controller applicable to a wide range of robots. In contrast to conventional velocity-based controllers, our approach builds on a position-based formulation of the RL task, which we show to be vital for stair climbing. Furthermore, the method uses an asymmetric actor-critic structure, which allows privileged information from the simulated environment to be used during training while eliminating the reliance on exteroceptive sensors during real-world deployment. Another key feature is a boolean observation fed to the controller that activates or deactivates a stair-climbing mode. We present results on different quadrupedal and bipedal robots in simulation and show how our method allows the balancing robot Ascento to climb 15 cm stairs in the real world, a task that was previously impossible for this robot.
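
To make the asymmetric actor-critic split and the boolean mode observation concrete, the following is a minimal illustrative sketch, not the authors' implementation. The function names, the observation layout, the 12-dimensional proprioceptive vector, and the choice of terrain height samples as privileged information are all assumptions made for this example; the abstract does not specify them.

```python
from typing import List, Sequence


def build_actor_obs(proprio: Sequence[float],
                    target_xy: Sequence[float],
                    stair_mode: bool) -> List[float]:
    """Observation assumed to be available on the real robot.

    Contains proprioception, a position-based command (a target position
    rather than a target velocity), and the boolean stair-climbing flag.
    No exteroceptive data is included.
    """
    flag = 1.0 if stair_mode else 0.0
    return list(proprio) + list(target_xy) + [flag]


def build_critic_obs(actor_obs: Sequence[float],
                     height_samples: Sequence[float]) -> List[float]:
    """Training-only observation: actor inputs plus privileged simulator data.

    Terrain height samples are a hypothetical stand-in for whatever
    privileged information the simulator exposes.
    """
    return list(actor_obs) + list(height_samples)


# Hypothetical sizes: 12 proprioceptive values, 2D target, 5 height samples.
proprio = [0.0] * 12
target_xy = [2.0, 0.0]
height_samples = [0.0, 0.0, 0.15, 0.15, 0.30]

actor_obs = build_actor_obs(proprio, target_xy, stair_mode=True)
critic_obs = build_critic_obs(actor_obs, height_samples)
```

In this sketch only the critic sees the height samples, so a policy trained this way would not need them at deployment, and toggling `stair_mode` changes a single entry of the actor's input.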