In the literature, actor-critic model predictive control (AC-MPC) integrates MPC with reinforcement learning to enable high-performance control of complex dynamical systems. However, its differentiable MPC layer must repeatedly solve an optimization problem in both the forward and backward passes, which leads to substantial training and inference latency. This paper addresses this bottleneck by introducing a CUDA-accelerated variant that significantly reduces end-to-end execution time while preserving the control performance of the baseline formulation. Simulation results on an agile drone racing task show that our approach achieves state-of-the-art lap times and near-limit dynamic behaviour with markedly reduced training and inference time.