Model Predictive Control (MPC) is attracting considerable attention in autonomous driving as a powerful control technique. The success of an MPC controller depends strongly on an accurate internal dynamics model. However, the model's static parameters, usually obtained by system identification, often fail to adapt to internal and external perturbations in real-world scenarios. In this paper, we (1) reformulate the problem as a Partially Observable Markov Decision Process (POMDP) that absorbs the uncertainties into the observations and preserves the Markov property in the hidden states; (2) learn a recurrent policy that continually adapts the parameters of the dynamics model via Recurrent Reinforcement Learning (RRL) for optimal and adaptive control; and (3) evaluate the proposed algorithm (referred to as $\textit{MPC-RRL}$) in the CARLA simulator, where it shows robust behaviour under a wide range of perturbations.
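The control loop described in the abstract can be illustrated with a minimal sketch: at each step a recurrent policy maps the current observation to dynamics-model parameters, and MPC plans with the model under those parameters. Everything concrete here is an assumption made for illustration and is not taken from the paper: the kinematic bicycle model, the two parameters (`wheelbase`, `accel_gain`), the random-shooting planner, the Elman-style cell, and all names and numeric values. The policy weights are random and untrained; the reinforcement learning step that would train them is omitted.

```python
import numpy as np

def bicycle_step(state, action, params, dt=0.1):
    """Kinematic bicycle model (assumed). params = (wheelbase, accel_gain)."""
    x, y, yaw, v = state
    steer, throttle = action
    wheelbase, accel_gain = params
    x += v * np.cos(yaw) * dt
    y += v * np.sin(yaw) * dt
    yaw += v / wheelbase * np.tan(steer) * dt
    v += accel_gain * throttle * dt
    return np.array([x, y, yaw, v])

def mpc_action(state, params, target, rng, horizon=10, n_samples=64):
    """Random-shooting MPC (assumed planner): sample action sequences,
    roll out the internal model, return the first action of the cheapest one."""
    low, high = np.array([-0.5, -1.0]), np.array([0.5, 1.0])
    seqs = rng.uniform(low, high, size=(n_samples, horizon, 2))
    costs = np.zeros(n_samples)
    for i in range(n_samples):
        s = state.copy()
        for t in range(horizon):
            s = bicycle_step(s, seqs[i, t], params)
            costs[i] += np.sum((s[:2] - target) ** 2)  # distance-to-target cost
    return seqs[np.argmin(costs), 0]

class RecurrentParamPolicy:
    """Hypothetical Elman-style cell mapping observations to model parameters.
    Weights are random and untrained in this sketch."""
    def __init__(self, obs_dim, hidden_dim, base_params, rng):
        self.W_in = rng.normal(0.0, 0.1, (hidden_dim, obs_dim))
        self.W_h = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        self.W_out = rng.normal(0.0, 0.1, (len(base_params), hidden_dim))
        self.base = np.asarray(base_params, dtype=float)
        self.h = np.zeros(hidden_dim)  # hidden state carried across time steps

    def step(self, obs):
        self.h = np.tanh(self.W_in @ obs + self.W_h @ self.h)
        # Multiplicative correction keeps the parameters strictly positive.
        return self.base * np.exp(self.W_out @ self.h)

def run_episode(steps=20, seed=0):
    rng = np.random.default_rng(seed)
    true_params = np.array([3.2, 0.7])  # perturbed "real" vehicle (made-up values)
    policy = RecurrentParamPolicy(4, 8, base_params=[2.5, 1.0], rng=rng)
    state, target = np.zeros(4), np.array([10.0, 2.0])
    history = []
    for _ in range(steps):
        params = policy.step(state)                       # adapt model parameters
        action = mpc_action(state, params, target, rng)   # plan with adapted model
        state = bicycle_step(state, action, true_params)  # environment responds
        history.append(params)
    return state, np.array(history)

if __name__ == "__main__":
    final_state, param_history = run_episode()
    print(final_state, param_history[-1])
```

In this sketch the observation is simply the full vehicle state; in a partially observed setting the policy would receive only what the sensors provide, and the hidden state `h` would carry the information needed to infer the current perturbation.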