Conventional trajectory planning approaches for autonomous racing are based on the sequential execution of predicting the opposing vehicles and subsequently planning the ego vehicle's trajectory. If the opposing vehicles do not react to the ego vehicle, they can be predicted accurately. However, once the vehicles interact, the prediction loses its validity. In highly interactive scenarios, a trajectory planning approach is therefore required that incorporates the interaction with the opposing vehicles, rather than one that reacts exclusively to a fixed prediction. This paper demonstrates the limitations of a widely used conventional sampling-based approach in a highly interactive blocking scenario. We show that high success rates are achieved for less aggressive blocking behavior, but that the collision rate increases with stronger interaction. We further propose a novel Reinforcement Learning (RL)-based trajectory planning approach for racing that explicitly exploits the interaction with the opposing vehicle without requiring a prediction. In contrast to the conventional approach, the RL-based approach achieves high success rates even for aggressive blocking behavior. Furthermore, we propose a novel safety layer (SL) that intervenes when the trajectory generated by the RL-based approach is infeasible. In that event, the SL generates a sub-optimal but feasible trajectory, avoiding termination of the scenario because no valid solution is found.
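The fallback logic of the proposed safety layer can be illustrated with a minimal sketch. This is not the paper's implementation: the names (`Trajectory`, `safety_layer`, `is_feasible`) and the lateral-acceleration feasibility check are hypothetical placeholders; the paper only states that the SL substitutes a sub-optimal but feasible trajectory when the RL output is infeasible.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Trajectory:
    points: List[Tuple[float, float]]  # (s, d) samples along the track
    max_lat_accel: float               # peak lateral acceleration [m/s^2]

# Hypothetical friction-based feasibility limit; the actual constraints
# in the paper (vehicle dynamics, track bounds, etc.) are richer.
A_LAT_LIMIT = 9.0

def is_feasible(traj: Trajectory) -> bool:
    """Stand-in feasibility check on a single dynamic constraint."""
    return traj.max_lat_accel <= A_LAT_LIMIT

def safety_layer(rl_traj: Trajectory, fallback: Trajectory) -> Trajectory:
    """Pass the RL trajectory through if feasible; otherwise intervene
    with a sub-optimal but feasible fallback, so the planner always
    returns some valid trajectory instead of terminating the scenario."""
    if is_feasible(rl_traj):
        return rl_traj
    return fallback
```

The key design point this sketch captures is that the SL never blocks the pipeline: it only replaces the RL output, so a feasible trajectory is returned in every planning cycle.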