This paper studies optimal traffic signal control in smart cities, a problem commonly treated as the control of a complex networked system. Because traffic lights and road networks interact dynamically, making controllers both adaptive and scalable is a primary challenge. Capturing the spatial-temporal correlation among traffic lights within the framework of Multi-Agent Reinforcement Learning (MARL) is a promising solution. However, existing MARL algorithms neglect effective information aggregation, which is fundamental to improving the learning capacity of decentralized agents. We design a new decentralized control architecture with improved environmental observability to capture this correlation. Specifically, we first develop a topology-aware information aggregation strategy that extracts correlation-related information from the unstructured data gathered in the road network. In particular, we convert the road network topology into a graph shift operator by defining a diffusion process on the topology, which in turn facilitates the construction of graph signals. We then develop a diffusion convolution module, yielding a new MARL algorithm that endows agents with graph learning capabilities. Extensive experiments on both synthetic and real-world datasets verify that our proposal outperforms existing decentralized algorithms.
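
The abstract does not give the exact form of the graph shift operator or the diffusion convolution, so the following is only a minimal sketch under stated assumptions. It assumes the common random-walk formulation, in which the shift operator is the row-normalized adjacency matrix P = D^{-1} A and the convolution is a weighted sum of K diffusion steps, sum_k theta_k P^k X. The function names, the fixed weights, and the three-intersection example are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def random_walk_operator(adjacency: Matrix) -> Matrix:
    """Assumed graph shift operator: row-normalized adjacency, P = D^{-1} A."""
    operator = []
    for row in adjacency:
        degree = sum(row)
        # An isolated intersection keeps an all-zero row (nothing diffuses to it).
        operator.append([w / degree if degree > 0 else 0.0 for w in row])
    return operator


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product, kept dependency-free for illustration."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def diffusion_convolution(adjacency: Matrix, signal: Matrix, weights: List[float]) -> Matrix:
    """Compute sum_k weights[k] * P^k X for a graph signal X (nodes x features).

    In a learned module the weights would be trainable; here they are fixed
    numbers purely to show the aggregation step.
    """
    operator = random_walk_operator(adjacency)
    diffused = [row[:] for row in signal]  # P^0 X
    output = [[0.0] * len(signal[0]) for _ in signal]
    for theta in weights:
        for i, row in enumerate(diffused):
            for j, value in enumerate(row):
                output[i][j] += theta * value
        diffused = matmul(operator, diffused)  # advance one diffusion step
    return output


# Hypothetical example: three intersections along one road (0 - 1 - 2),
# with one feature per intersection (e.g. queue length).
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
queue_lengths = [[4.0], [0.0], [2.0]]
aggregated = diffusion_convolution(adjacency, queue_lengths, [0.5, 0.5])
```

In this sketch each agent's aggregated feature mixes its own observation with those of its neighbours, which is one way to read the "improved environmental observability" described above. Longer weight lists would let information from more distant intersections diffuse in.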