Automated driving at unsignalized intersections is challenging because of complex multi-vehicle interactions and the need to balance safety and efficiency. Model Predictive Control (MPC) handles constraints in a structured way through optimization, but it relies on hand-crafted rules that often produce overly conservative behavior. Deep Reinforcement Learning (RL) learns adaptive behaviors from experience, but it often struggles with safety assurance and generalization to unseen environments. In this study, we present an integrated MPC-RL framework to improve navigation performance in multi-agent scenarios. Experiments show that MPC-RL outperforms both standalone MPC and end-to-end RL across three traffic-density levels. Overall, MPC-RL reduces the collision rate by 21% and improves the success rate by 6.5% compared to pure MPC. We further evaluate zero-shot transfer to a highway merging scenario without retraining. Both MPC-based methods transfer substantially better than end-to-end PPO (Proximal Policy Optimization), which highlights the role of the MPC backbone in cross-scenario robustness. The framework also shows faster loss stabilization during training than end-to-end RL, which suggests a reduced learning burden. These results suggest that the integrated approach can improve the balance between safety and efficiency in multi-agent intersection scenarios, while the MPC component provides a strong foundation for generalization across driving environments. The implementation is available as open-source code.