This study focuses on a crucial task in the field of autonomous driving: autonomous lane change. Autonomous lane change plays a pivotal role in improving traffic flow, alleviating driver burden, and reducing the risk of traffic accidents. However, due to the complexity and uncertainty of lane-change scenarios, autonomous lane-change functionality still faces challenges. In this research, we conduct autonomous lane-change simulations using both Deep Reinforcement Learning (DRL) and Model Predictive Control (MPC). Specifically, we propose the Parameterized Soft Actor-Critic (PASAC) algorithm to train a DRL-based lane-change strategy that outputs both discrete lane-change decisions and continuous longitudinal vehicle acceleration. We also use MPC for lane selection based on predicted car-following costs for the different lanes. For the first time, we compare the performance of DRL and MPC in the context of lane-change decision-making. Simulation results indicate that, under the same reward/cost functions and traffic flow, both MPC and PASAC achieve a collision rate of 0\%. PASAC demonstrates performance comparable to MPC in terms of episodic rewards/costs and average vehicle speeds.
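As a rough illustration of the MPC lane-selection idea described above, the sketch below rolls out a simple constant-speed prediction of each lane's lead vehicle and picks the lane with the lowest accumulated car-following cost. All dynamics, weights, and function names here are hypothetical simplifications for exposition, not the paper's actual controller:

```python
def car_following_cost(gap, ego_speed, lead_speed, horizon=20, dt=0.1,
                       desired_gap=30.0, desired_speed=15.0):
    """Predict the gap over a short horizon (lead vehicle assumed to keep its
    speed) and accumulate a quadratic cost on gap and speed errors.
    Weights and targets are illustrative placeholders."""
    cost = 0.0
    for _ in range(horizon):
        gap += (lead_speed - ego_speed) * dt          # predicted gap evolution
        cost += 0.01 * (gap - desired_gap) ** 2       # penalize gap error
        cost += 0.10 * (ego_speed - desired_speed) ** 2  # penalize speed error
    return cost

def select_lane(lane_states):
    """lane_states: {lane_id: (gap, ego_speed, lead_speed)}.
    Return the lane whose predicted car-following cost is lowest."""
    costs = {lane: car_following_cost(*s) for lane, s in lane_states.items()}
    return min(costs, key=costs.get)
```

For example, a lane with a large gap and a lead vehicle matching the ego speed would be preferred over a lane with a short, closing gap.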