This letter presents a control framework that combines model-based optimal control and reinforcement learning (RL) to achieve versatile and robust legged locomotion. Our approach augments RL training with on-demand reference motions, generated by finite-horizon optimal control, that cover a broad range of velocities and gaits. The RL policy learns to imitate these reference motions, which allows robust control policies to be learned efficiently and reliably. Moreover, because RL considers whole-body dynamics, it overcomes the inherent limitations of modelling simplifications. Through simulation and hardware experiments, we demonstrate the robustness and controllability of the RL training process within our framework. Furthermore, by leveraging the flexibility of RL, our method generalizes the reference motions and handles more complex locomotion tasks that may be challenging for the simplified model.
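To make the idea of reference motions serving as imitation targets concrete, the following is a minimal sketch of an imitation reward term. It is an illustration only: the function name `imitation_reward`, the exponential kernel, and the `sigma` parameter are assumptions chosen because this reward shape is common in motion-imitation RL, not details stated in the abstract.

```python
import math
from typing import Sequence


def imitation_reward(
    state: Sequence[float],
    reference: Sequence[float],
    sigma: float = 0.5,
) -> float:
    """Hypothetical imitation reward: 1.0 when the robot state matches the
    reference motion exactly, decaying toward 0.0 as the error grows.

    `state` and `reference` are assumed to hold the same quantities in the
    same order (for example joint positions or base velocity).
    """
    if len(state) != len(reference):
        raise ValueError("state and reference must have the same length")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")

    # Squared Euclidean tracking error between the state and the reference.
    sq_err = sum((s - r) ** 2 for s, r in zip(state, reference))

    # Exponential kernel: an assumed, commonly used reward shape.
    return math.exp(-sq_err / (sigma ** 2))


if __name__ == "__main__":
    reference = [0.0, 0.4, -0.8]   # placeholder reference joint targets
    close = [0.02, 0.41, -0.79]    # state that tracks the reference well
    far = [0.5, -0.2, 0.1]         # state that tracks the reference poorly
    print(imitation_reward(close, reference))
    print(imitation_reward(far, reference))
```

In a training loop, a term of this kind would typically be summed with task and regularization rewards, with the reference queried on demand from the optimal controller at each step; how the terms are weighted is left open here.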