Unicorn: A Universal and Collaborative Reinforcement Learning Approach Towards Generalizable Network-Wide Traffic Signal Control

From arXiv. © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Adaptive traffic signal control (ATSC) is crucial for reducing congestion, maximizing throughput, and improving mobility in rapidly growing urban areas. Recent advances in parameter-sharing multi-agent reinforcement learning (MARL) have made the adaptive optimization of complex, dynamic flows far more scalable in large homogeneous networks. However, real-world traffic networks are inherently heterogeneous, with varied intersection topologies and interaction dynamics, which poses substantial challenges to achieving scalable and effective ATSC across different traffic scenarios. To address these challenges, we present Unicorn, a universal and collaborative MARL framework designed for efficient and adaptable network-wide ATSC. Specifically, we first propose a unified approach that maps the states and actions of intersections with varying topologies into a common structure based on traffic movements. Next, we design a Universal Traffic Representation (UTR) module with a decoder-only network for general feature extraction, enhancing the model's adaptability to diverse traffic scenarios. Additionally, we incorporate an Intersection Specifics Representation (ISR) module, which uses variational inference to identify key latent vectors that capture each intersection's unique topology and traffic dynamics. To further refine these latent representations, we employ self-supervised contrastive learning, which enables better differentiation of intersection-specific features. Moreover, we integrate the state-action dependencies of neighboring agents into policy optimization, which effectively captures dynamic agent interactions and facilitates efficient regional collaboration. [...]. The code is available at https://github.com/marmotlab/Unicorn
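The idea of mapping intersections with different topologies into a common movement-based structure can be illustrated with a small sketch. It is not taken from the Unicorn code base: the 12-movement ordering (four approaches times three turns), the zero-padding scheme, and the names `MOVEMENTS`, `to_common_state`, and `valid_phases` are all assumptions made for illustration. Missing movements are padded with zeros and flagged in a mask, and a phase is treated as usable only if every movement it serves exists.

```python
from typing import Dict, List, Tuple

# Hypothetical canonical movement order: 4 approaches x 3 turns.
# This layout is an assumption for illustration, not the paper's definition.
APPROACHES = ("N", "E", "S", "W")
TURNS = ("left", "through", "right")
MOVEMENTS: List[str] = [f"{a}_{t}" for a in APPROACHES for t in TURNS]


def to_common_state(queues: Dict[str, float]) -> Tuple[List[float], List[int]]:
    """Pad per-movement observations to a fixed-length vector plus a validity mask."""
    unknown = set(queues) - set(MOVEMENTS)
    if unknown:
        raise ValueError(f"unknown movements: {sorted(unknown)}")
    state = [float(queues.get(m, 0.0)) for m in MOVEMENTS]
    mask = [1 if m in queues else 0 for m in MOVEMENTS]
    return state, mask


def valid_phases(phases: Dict[str, List[str]], mask: List[int]) -> List[str]:
    """Keep only the phases whose movements all exist at this intersection."""
    index = {m: i for i, m in enumerate(MOVEMENTS)}
    return [name for name, moves in phases.items()
            if all(mask[index[m]] == 1 for m in moves)]


# A three-leg (T-shaped) intersection with no north leg (made-up queue lengths).
t_junction = {"E_left": 3, "E_through": 5, "S_left": 2, "S_right": 1,
              "W_through": 4, "W_right": 0}
state, mask = to_common_state(t_junction)

# Hypothetical phase table shared by all intersections.
phases = {"EW_through": ["E_through", "W_through"],
          "NS_through": ["N_through", "S_through"],
          "S_left_right": ["S_left", "S_right"]}
usable = valid_phases(phases, mask)
```

With this layout, a shared policy always sees a 12-element state regardless of topology, and the mask lets it ignore padded entries and rule out phases that do not exist at a given intersection.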
