Differentiable model predictive control (MPC) offers a powerful framework for combining learning and control. However, its adoption has been limited by the inherently sequential nature of traditional optimization algorithms, which are challenging to parallelize on modern computing hardware like GPUs. In this work, we tackle this bottleneck by introducing a GPU-accelerated differentiable optimization tool for MPC. This solver leverages sequential quadratic programming and a custom preconditioned conjugate gradient (PCG) routine with tridiagonal preconditioning to exploit the problem's structure and enable efficient parallelization. We demonstrate substantial speedups over CPU- and GPU-based baselines, significantly improving upon state-of-the-art training times on benchmark reinforcement learning and imitation learning tasks. Finally, we showcase the method on the challenging task of reinforcement learning for driving at the limits of handling, where it enables robust drifting of a Toyota Supra through water puddles.
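To make the role of the preconditioned conjugate gradient step concrete, here is a minimal CPU sketch in NumPy. It is not the solver described above. It assumes a dense, symmetric positive definite, diagonally dominant matrix, and it swaps in a plain scalar tridiagonal preconditioner (the tridiagonal part of the matrix, inverted with the sequential Thomas algorithm) in place of the custom GPU routine. The names `thomas_solve`, `pcg`, and `banded_spd` are hypothetical and chosen for illustration only.

```python
import numpy as np


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system (size n >= 2) with the Thomas algorithm.

    lower and upper hold the sub- and super-diagonal (length n - 1),
    diag holds the main diagonal (length n).
    """
    n = diag.size
    c = np.zeros(n)
    d = np.zeros(n)
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    # Back substitution.
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def pcg(A, b, tol=1e-10, max_iter=200):
    """Preconditioned CG for SPD A, with M = tridiagonal part of A.

    Assumption: M is itself SPD (true when A is diagonally dominant).
    Returns the solution estimate and the number of iterations used.
    """
    lower, diag, upper = np.diag(A, -1), np.diag(A), np.diag(A, 1)
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    z = thomas_solve(lower, diag, upper, r)  # z = M^{-1} r
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k + 1
        z = thomas_solve(lower, diag, upper, r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


def banded_spd(n):
    """Hypothetical test matrix: pentadiagonal, diagonally dominant, SPD."""
    A = 4.0 * np.eye(n)
    A += -1.0 * (np.eye(n, k=1) + np.eye(n, k=-1))
    A += -0.5 * (np.eye(n, k=2) + np.eye(n, k=-2))
    return A
```

Calling `pcg(banded_spd(50), np.ones(50))` returns the solution estimate together with the iteration count. The Thomas solve used here is inherently sequential, so it only illustrates what a tridiagonal preconditioner does numerically; a GPU implementation would need a preconditioner that can be applied in parallel.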