Reinforcement learning has shown promise for traffic signal control, with the potential to relieve congestion and improve efficiency. However, existing methods lack effective mechanisms for learning two kinds of dynamic information: that inherent to a specific scenario and that shared across scenarios. Moreover, within a given scenario, they fail to fully capture the empirical experience of how a target intersection should coordinate with its neighbors, leading to sub-optimal system-wide outcomes. To address these issues, we propose DuaLight, which leverages both the experiential information within a single scenario and the generalizable information across scenarios for better decision-making. Specifically, DuaLight introduces a scenario-specific experiential weight module with two learnable parts, Intersection-wise and Feature-wise, which guide how neighbors and input features are adaptively used in each scenario, thus providing a more fine-grained understanding of different intersections. Furthermore, we implement a scenario-shared Co-Train module to facilitate the learning of generalizable dynamics across scenarios. Empirical results on both real-world and synthetic scenarios show that DuaLight achieves competitive performance across various metrics, with 3-7% improvements, offering a promising solution for alleviating traffic congestion. The code is available at https://github.com/lujiaming-12138/DuaLight.
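To make the idea of Intersection-wise and Feature-wise weights concrete, the following is a minimal illustrative sketch and not the authors' implementation (see the repository above for that). It assumes, purely for illustration, that the intersection-wise weights are a softmax over the target intersection and its neighbors, and that the feature-wise weights are sigmoid gates on each input feature. The function name `weighted_observation` and the parameters `inter_logits` and `feat_logits` are hypothetical; in a real system they would be learned per scenario.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def weighted_observation(features, neighbors, inter_logits, feat_logits, target):
    """Blend a target intersection's features with its neighbors' features.

    features:     dict, intersection id -> list of F raw features
    neighbors:    dict, intersection id -> list of neighbor ids
    inter_logits: dict, intersection id -> logits, one for the target itself
                  followed by one per neighbor (hypothetical learnable weights)
    feat_logits:  list of F logits (hypothetical learnable weights)
    """
    group = [target] + neighbors[target]
    # Assumption: intersection-wise weights form a distribution over the group.
    inter_w = softmax(inter_logits[target])
    # Assumption: feature-wise weights gate each feature independently.
    feat_w = [sigmoid(z) for z in feat_logits]

    n_feat = len(feat_logits)
    out = [0.0] * n_feat
    for w, node in zip(inter_w, group):
        for j in range(n_feat):
            out[j] += w * feat_w[j] * features[node][j]
    return out


# Toy scenario: intersection "A" with neighbors "B" and "C", two features each
# (for example, queue length and number of moving vehicles; these are made up).
features = {"A": [4.0, 2.0], "B": [8.0, 0.0], "C": [0.0, 6.0]}
neighbors = {"A": ["B", "C"]}
inter_logits = {"A": [0.0, 0.0, 0.0]}  # untrained: all intersections weighted equally
feat_logits = [0.0, 0.0]               # untrained: every feature gated at 0.5

obs = weighted_observation(features, neighbors, inter_logits, feat_logits, "A")
print(obs)
```

With all logits at zero the module reduces to a uniform average scaled by 0.5. Training the logits per scenario would let each intersection emphasize particular neighbors and features, which is the kind of fine-grained, scenario-specific behavior the abstract describes.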