Active traffic management that incorporates autonomous vehicles (AVs) promises reduced congestion and smoother traffic flow. However, developing algorithms for real-world application requires addressing the challenges posed by continuous traffic flow and partial observability. To bridge this gap and move active traffic management towards greater decentralization, we introduce a novel asymmetric actor-critic model that learns decentralized cooperative driving policies for AVs using single-agent reinforcement learning. Our approach employs attention neural networks with masking to handle the dynamic nature of real-world traffic flow and partial observability. In extensive evaluations against baseline controllers across various traffic scenarios, our model shows great potential for improving traffic flow at diverse bottleneck locations within the road system. We also examine the challenge posed by the conservative driving behavior of AVs that adhere strictly to traffic regulations. The experimental results indicate that our proposed cooperative policy can mitigate potential traffic slowdowns without compromising safety.
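To make the role of masking concrete, the following is a minimal sketch of masked scaled dot-product attention over a padded set of neighboring vehicles. It is not the architecture used in this work: the function name `masked_attention`, the embedding sizes, and the fixed slot count are all assumptions chosen for illustration. The idea it shows is that when the number of observed vehicles changes over time, the unused slots are masked out so that they receive exactly zero attention weight.

```python
import numpy as np

def masked_attention(query, keys, values, mask):
    """Scaled dot-product attention over a padded set of neighbors.

    query:  (d,)     ego-vehicle embedding (hypothetical)
    keys:   (n, d)   neighbor embeddings, padded to a fixed n
    values: (n, d_v) neighbor features, padded with zeros
    mask:   (n,)     True where a slot holds a real, observed vehicle
    """
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)            # one score per slot
    scores = np.where(mask, scores, -np.inf)      # exclude padded slots
    if not mask.any():
        # No vehicle observed: return a zero context vector and zero weights.
        return np.zeros(values.shape[-1]), np.zeros(mask.shape[0])
    scores = scores - scores[mask].max()          # numerical stability
    weights = np.exp(scores)                      # exp(-inf) == 0 for padding
    weights = weights / weights.sum()
    return weights @ values, weights

# Illustrative usage: 3 observed vehicles in 5 available slots.
rng = np.random.default_rng(0)
query = rng.normal(size=4)
keys = np.zeros((5, 4))
values = np.zeros((5, 3))
keys[:3] = rng.normal(size=(3, 4))
values[:3] = rng.normal(size=(3, 3))
mask = np.array([True, True, True, False, False])
context, weights = masked_attention(query, keys, values, mask)
```

Because padded slots contribute nothing, the same fixed-size network input can represent any number of observed vehicles up to the slot limit, which is one common way to cope with vehicles continuously entering and leaving the observation range.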