Autonomous lane changing, a key feature of advanced driver-assistance systems, can improve traffic efficiency and reduce accidents. However, driving safely in complex environments remains challenging for autonomous vehicles, and how to perform safe and appropriate lane changes is an active research topic. Few existing studies consider the safety of reinforcement learning in autonomous lane-change scenarios. We introduce safe hybrid-action reinforcement learning into discretionary lane changing for the first time and propose the Parameterized Soft Actor-Critic with PID Lagrangian (PASAC-PIDLag) algorithm. We compare it with Parameterized Soft Actor-Critic (PASAC), the unsafe version of PASAC-PIDLag. Both algorithms are used to train a lane-change policy for autonomous vehicles that outputs a discrete lane-change decision and a longitudinal acceleration. Simulation results show that at a traffic density of 15 vehicles per kilometer (veh/km), PASAC-PIDLag achieves a collision rate of 0%, compared with 1% for PASAC. Generalization tests show that at low traffic densities both algorithms attain a 0% collision rate, whereas at high traffic densities PASAC-PIDLag outperforms PASAC in both safety and optimality.
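The abstract does not spell out how the PID Lagrangian component works, so the following is only a minimal sketch of the general idea behind PID-controlled Lagrange multipliers in constrained reinforcement learning, not the authors' implementation. It assumes the safety signal is a scalar episode cost that should stay below a limit; the class name `PIDLagrangian`, the gains `kp`, `ki`, `kd`, and the `cost_limit` value are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIDLagrangian:
    """Illustrative PID update of a Lagrange multiplier (hypothetical names and gains)."""

    kp: float = 0.1          # proportional gain (assumed value)
    ki: float = 0.01         # integral gain (assumed value)
    kd: float = 0.01         # derivative gain (assumed value)
    cost_limit: float = 0.0  # allowed episode cost (assumed value)
    integral: float = 0.0    # accumulated constraint violation
    prev_cost: float = 0.0   # cost seen at the previous update

    def update(self, episode_cost: float) -> float:
        """Return a non-negative multiplier given the latest measured cost."""
        error = episode_cost - self.cost_limit
        # The integral term accumulates violations and is clamped at zero.
        self.integral = max(0.0, self.integral + error)
        # The derivative term reacts only when the cost is increasing.
        derivative = max(0.0, episode_cost - self.prev_cost)
        self.prev_cost = episode_cost
        multiplier = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(0.0, multiplier)


def penalized_reward(reward: float, cost: float, multiplier: float) -> float:
    """Combine reward and safety cost into one objective for the policy update."""
    return reward - multiplier * cost


controller = PIDLagrangian(kp=1.0, ki=0.5, kd=0.0, cost_limit=1.0)
lam_violating = controller.update(3.0)  # cost above the limit raises the multiplier
lam_safe = controller.update(0.0)       # cost below the limit lets it fall back
```

In a sketch like this, the multiplier grows while the measured cost exceeds the limit and relaxes once the policy behaves safely, so the penalty on unsafe behavior adapts during training instead of being fixed by hand.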