In this paper, we address the problem of improving the performance of reinforcement learning (RL) solutions for autonomous racing cars navigating under practical vehicle modelling errors, commonly known as \emph{model mismatches}. To address this challenge, we propose a partial end-to-end algorithm that decouples the planning and control tasks. Within this framework, an RL agent generates a trajectory comprising a path and a velocity profile, which are tracked by a pure pursuit steering controller and a proportional velocity controller, respectively. In contrast, many current learning-based (i.e., reinforcement and imitation learning) algorithms adopt an end-to-end approach in which a deep neural network maps sensor data directly to control commands. By leveraging the robustness of a classical controller, our partial end-to-end driving algorithm is more robust to model mismatches than standard end-to-end algorithms.
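The tracking layer described above can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, gains, and the bicycle-model pure pursuit law are standard textbook forms assumed here for illustration.

```python
import math

def pure_pursuit_steering(pose, lookahead_point, wheelbase):
    """Pure pursuit: steer the vehicle toward a lookahead point on the
    planned path. pose = (x, y, heading in rad); returns a steering
    angle in radians using the standard bicycle-model geometry."""
    x, y, theta = pose
    lx, ly = lookahead_point
    # Angle from the vehicle's heading to the lookahead point.
    alpha = math.atan2(ly - y, lx - x) - theta
    # Distance to the lookahead point.
    ld = math.hypot(lx - x, ly - y)
    # delta = atan(2 * L * sin(alpha) / ld)
    return math.atan2(2.0 * wheelbase * math.sin(alpha), ld)

def proportional_speed_control(v_ref, v, kp=2.0):
    """P controller on speed: acceleration command proportional to the
    error between the planned and measured velocity. kp is a
    hypothetical gain, not taken from the paper."""
    return kp * (v_ref - v)
```

In this decomposition, the RL agent only has to output waypoints and reference speeds; the two controllers above absorb some of the plant's modelling error, which is the source of the robustness claimed in the abstract.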