Communication Strategy on Macro-and-Micro Traffic State in Cooperative Deep Reinforcement Learning for Regional Traffic Signal Control

Adaptive Traffic Signal Control (ATSC) has become a popular research topic in intelligent transportation systems. Regional Traffic Signal Control (RTSC) based on Multi-agent Deep Reinforcement Learning (MADRL) is a promising approach to ATSC because it balances scalability and optimality. Most existing RTSC approaches partition a traffic network into several disjoint regions and then apply centralized reinforcement learning to each region. However, cooperation among RTSC agents remains an open issue, and no communication strategy for RTSC agents has been investigated. In this paper, we propose communication strategies that capture the correlation of micro-traffic states among lanes and the correlation of macro-traffic states among intersections. We first show, via a system of store-and-forward queues, that the evolution equation of the RTSC process is Markovian. Next, based on this evolution equation, we propose two GAT-Aggregated (GA2) communication modules, GA2-Naive and GA2-Aug, which extract both intra-region and inter-region correlations between macro- and micro-traffic states. While GA2-Naive considers only the movements at each intersection, GA2-Aug also considers the lane-changing behavior of vehicles. The two proposed communication modules are then integrated into two existing RTSC frameworks, RegionLight and Regional-DRL. Experimental results demonstrate that both GA2-Naive and GA2-Aug improve the performance of these RTSC frameworks in both real and synthetic scenarios. Hyperparameter tests also indicate that the communication modules are robust and show potential for large-scale traffic networks.
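To illustrate why a store-and-forward queue model yields a Markovian evolution, the following is a minimal sketch of a generic store-and-forward update, not the paper's actual evolution equation. The function name `step_queues`, the lane labels, the saturation flows, and the turning ratios are all hypothetical choices made for this example. The point it shows is that the next queue lengths depend only on the current queue lengths, the current signal action, and the exogenous inflow.

```python
from typing import Dict, Set, Tuple


def step_queues(
    queues: Dict[str, float],
    green: Set[str],
    sat_flow: Dict[str, float],
    routing: Dict[Tuple[str, str], float],
    inflow: Dict[str, float],
) -> Dict[str, float]:
    """One step of a simplified store-and-forward queue model (illustrative only).

    queues   : current number of queued vehicles per lane (the state)
    green    : lanes that receive a green signal this step (the action)
    sat_flow : maximum vehicles a lane can discharge in one step
    routing  : (upstream lane, downstream lane) -> fraction of discharged vehicles
    inflow   : exogenous arrivals per lane from outside the network
    """
    # A lane discharges only when green, limited by its queue and saturation flow.
    served = {
        lane: min(q, sat_flow[lane]) if lane in green else 0.0
        for lane, q in queues.items()
    }
    # Conservation of vehicles: old queue - departures + external arrivals.
    nxt = {
        lane: q - served[lane] + inflow.get(lane, 0.0)
        for lane, q in queues.items()
    }
    # Discharged vehicles are forwarded to downstream lanes by turning ratio.
    for (src, dst), ratio in routing.items():
        nxt[dst] += ratio * served[src]
    return nxt


# Hypothetical two-lane example: lane "a" feeds half of its discharge into lane "b".
state = {"a": 10.0, "b": 4.0}
next_state = step_queues(
    queues=state,
    green={"a"},
    sat_flow={"a": 6.0, "b": 6.0},
    routing={("a", "b"): 0.5},
    inflow={"a": 2.0},
)
print(next_state)  # {'a': 6.0, 'b': 7.0}
```

Because `step_queues` takes no history beyond the current `queues`, the resulting process is Markovian in this simplified setting; vehicles that leave the network (the remaining half of lane "a"'s discharge here) are simply dropped.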
