Multi-agent reinforcement learning (MARL) has shown promise for adaptive traffic signal control (ATSC), enabling multiple intersections to coordinate signal timings in real time. In large-scale settings, however, MARL is constrained by extensive data-sharing and communication requirements. Federated learning (FL) mitigates these challenges by training shared models without directly exchanging raw data, yet traditional FL methods such as FedAvg struggle with highly heterogeneous intersections. Because different intersections exhibit varying traffic patterns, demands, and road structures, performing FedAvg across all agents is inefficient. To address this gap, we propose Hierarchical Federated Reinforcement Learning (HFRL) for ATSC. HFRL employs clustering-based or optimization-based techniques to dynamically group intersections and performs FedAvg independently within each group of intersections with similar characteristics, enabling more effective coordination and better scalability than standard FedAvg. Our experiments on synthetic and real-world traffic networks demonstrate that HFRL consistently outperforms decentralized and standard federated RL approaches, and achieves competitive or superior performance relative to centralized RL as network scale and heterogeneity increase, particularly in real-world settings. HFRL also identifies suitable grouping patterns based on network structure or traffic demand, yielding a more robust framework for distributed, heterogeneous systems.
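The grouped-aggregation idea in the abstract can be illustrated with a minimal sketch: intersections are clustered by per-intersection traffic features, and FedAvg is then performed independently within each cluster. This is only an illustrative assumption of one possible instantiation (plain k-means plus per-group weight averaging); the function and variable names (`cluster_agents`, `grouped_fedavg`, `traffic_features`) are hypothetical and not the paper's actual implementation.

```python
# Illustrative sketch (NOT the paper's implementation): cluster intersections
# by feature vectors, then run FedAvg independently inside each cluster.
import numpy as np

def cluster_agents(traffic_features, k, iters=50, seed=0):
    """Plain k-means on per-intersection feature vectors (e.g. demand, degree)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(traffic_features, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each intersection to its nearest cluster center.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Recompute each center as the mean of its assigned intersections.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def grouped_fedavg(agent_weights, labels):
    """FedAvg performed independently within each group of similar agents."""
    merged = {}
    for c in np.unique(labels):
        members = [w for w, lab in zip(agent_weights, labels) if lab == c]
        # Average each layer's weights across the group's members only.
        merged[int(c)] = [np.mean(layer, axis=0) for layer in zip(*members)]
    return merged

# Usage: 6 intersections forming two clearly separated feature clusters;
# each agent holds a single (toy) weight vector.
feats = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]]
weights = [[np.full(3, float(i))] for i in range(6)]
labels = cluster_agents(feats, k=2)
avg = grouped_fedavg(weights, labels)  # one averaged model per group
```

The key design point the abstract argues for is visible here: averaging happens only among agents with similar characteristics, so a heavily loaded arterial intersection never dilutes the model of a quiet residential one, unlike global FedAvg.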