In this paper, we introduce a new class of parameterized controllers inspired by Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver for a linear MPC problem, but its parameters are trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations, in terms of verifiability and performance guarantees, of the Multi-Layer Perceptron (MLP) and other general neural network architectures commonly used for controllers in DRL: the learned controllers possess verifiable properties such as persistent feasibility and asymptotic stability, akin to MPC. Numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in control performance and is more robust to modeling uncertainty and noise. Furthermore, the proposed controller is significantly more computationally efficient than MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on a vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.
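To make the idea of a controller that "resembles a QP solver" more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a box-constrained QP and a fixed number of projected-gradient iterations; the abstract does not specify the solver or the constraint structure. The names `qp_controller`, `P`, `H`, `step`, and `n_iter` are hypothetical. In a DRL setting, `P` and `H` would be the trainable parameters, updated by the learning algorithm instead of being computed from a system model.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, v: Vector) -> Vector:
    """Multiply matrix a by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def qp_controller(x: Vector, P: Matrix, H: Matrix,
                  u_min: float, u_max: float,
                  step: float = 0.1, n_iter: int = 50) -> float:
    """Run a fixed number of projected-gradient iterations on

        min_u 0.5 * u^T P u + (H x)^T u   s.t.  u_min <= u_i <= u_max

    and return the first element of u, in receding-horizon fashion.
    P and H stand in for learnable parameters (hypothetical names);
    the solver choice is an assumption made for illustration.
    """
    q = matvec(H, x)          # linear cost term depends on the state
    u = [0.0] * len(P)        # start every call from zero
    for _ in range(n_iter):
        grad = [g + q_i for g, q_i in zip(matvec(P, u), q)]
        # Gradient step followed by projection onto the box constraints.
        u = [min(u_max, max(u_min, u_i - step * g_i))
             for u_i, g_i in zip(u, grad)]
    return u[0]


# Example parameters; in training these would be learned, not hand-set.
P_example = [[2.0, 0.0], [0.0, 2.0]]
H_example = [[1.0, 0.0], [0.0, 1.0]]
u_first = qp_controller([1.0, -0.5], P_example, H_example, -1.0, 1.0)
u_saturated = qp_controller([10.0, 0.0], P_example, H_example, -1.0, 1.0)
```

Because the iteration count is fixed, the controller is a finite composition of affine maps and clipping operations. This structure is one reason such a parameterization can be cheaper to evaluate than solving an MPC problem to convergence at each step.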