We present CAJun, a novel hierarchical learning and control framework that enables legged robots to jump continuously with adaptive jumping distances. CAJun consists of a high-level centroidal policy and a low-level leg controller. In particular, we use reinforcement learning (RL) to train the centroidal policy, which specifies the gait timing, base velocity, and swing foot position for the leg controller. Following the gait timing, the leg controller uses optimal control to compute motor commands for the swing and stance legs that track the swing foot target and base velocity commands. Additionally, we reformulate the stance leg optimizer in the leg controller to speed up policy training by an order of magnitude. By combining RL with optimal control, our system achieves the versatility of learning while retaining the robustness of control methods, making it easily transferable to real robots. We show that after 20 minutes of training on a single GPU, CAJun can achieve continuous, long jumps with adaptive distances on a Go1 robot with small sim-to-real gaps. Moreover, the robot can jump across gaps with a maximum width of 70 cm, which is over 40% wider than existing methods.
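The two-level structure described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`CentroidalCommand`, `GaitClock`, `leg_controller`, `run`) is hypothetical, the even stance/swing split of the gait phase is an assumption, the ratio of low-level to high-level steps is an assumption, and the stance-leg optimizer is replaced by a simple proportional velocity-tracking placeholder. The sketch only shows how a high-level command (gait timing, base velocity, swing foot position) might flow into a low-level controller that switches between stance and swing behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class CentroidalCommand:
    """High-level command holding the three quantities named in the abstract."""
    stepping_frequency_hz: float  # gait timing (hypothetical parameterization)
    base_velocity: Vec3           # desired base velocity, m/s
    swing_foot_position: Vec3     # desired swing foot position, m


class GaitClock:
    """Tracks a gait phase in [0, 1). Assumption: first half stance, second half swing."""

    def __init__(self) -> None:
        self.phase = 0.0

    def step(self, frequency_hz: float, dt: float) -> float:
        self.phase = (self.phase + frequency_hz * dt) % 1.0
        return self.phase

    def in_stance(self) -> bool:
        return self.phase < 0.5


def leg_controller(cmd: CentroidalCommand, clock: GaitClock, base_velocity: Vec3) -> Dict:
    """Low-level controller stub that picks a tracking target from the gait phase."""
    if clock.in_stance():
        # Placeholder for the stance-leg optimizer: proportional velocity tracking.
        kp = 10.0  # illustrative gain, not taken from the paper
        accel = tuple(kp * (d - v) for d, v in zip(cmd.base_velocity, base_velocity))
        return {"mode": "stance", "desired_base_accel": accel}
    # Swing legs track the foot position requested by the high-level policy.
    return {"mode": "swing", "foot_target": cmd.swing_foot_position}


def run(policy: Callable[[Vec3], CentroidalCommand],
        high_level_steps: int = 4,
        low_level_steps_per_command: int = 5,
        dt: float = 0.01) -> List[str]:
    """Runs the hierarchy: one policy query, then several low-level control steps."""
    clock = GaitClock()
    base_velocity: Vec3 = (0.0, 0.0, 0.0)  # no dynamics are simulated in this sketch
    modes: List[str] = []
    for _ in range(high_level_steps):
        cmd = policy(base_velocity)
        for _ in range(low_level_steps_per_command):
            clock.step(cmd.stepping_frequency_hz, dt)
            modes.append(leg_controller(cmd, clock, base_velocity)["mode"])
    return modes
```

In a real system the `policy` argument would be the trained RL policy and the stance branch would solve an optimization problem for the leg commands. Here a fixed lambda such as `lambda v: CentroidalCommand(10.0, (1.0, 0.0, 0.0), (0.1, 0.0, 0.0))` is enough to exercise the loop.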