Reinforcement learning (RL) is increasingly applied to traffic signal control (TSC). However, most existing RL methods are confined to a single-stage TSC framework: they select a traffic signal phase at fixed action intervals, which leaves phase durations inflexible and less adaptable. To address this limitation, we introduce DynamicLight, a novel two-stage TSC framework. It first applies a phase control strategy to determine the optimal traffic phase, then a duration control strategy to determine the corresponding phase duration. Experimental results show that DynamicLight outperforms state-of-the-art TSC models and exhibits strong model generalization. Additionally, several DynamicLight variants further demonstrate its robustness and its potential for real-world implementation. The code is released at https://github.com/LiangZhang1996/DynamicLight.
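The two-stage structure described above (choose a phase first, then choose how long to hold it) can be illustrated with a minimal sketch. This is not the DynamicLight implementation: the abstract does not specify how either stage is realized, so the hand-written rules below merely stand in for the two control strategies. The function names, the candidate duration set, and the `seconds_per_vehicle` parameter are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate durations in seconds; not taken from the paper.
CANDIDATE_DURATIONS: List[int] = [10, 15, 20, 25, 30]


def select_phase(queue_by_phase: Dict[str, int]) -> str:
    """Stage 1 (assumed rule): pick the phase serving the longest queue."""
    return max(queue_by_phase, key=queue_by_phase.get)


def select_duration(queue_length: int, seconds_per_vehicle: float = 2.0) -> int:
    """Stage 2 (assumed rule): pick the shortest candidate duration that can
    discharge the queue, falling back to the largest candidate."""
    needed = queue_length * seconds_per_vehicle
    for duration in CANDIDATE_DURATIONS:
        if duration >= needed:
            return duration
    return CANDIDATE_DURATIONS[-1]


def two_stage_step(queue_by_phase: Dict[str, int]) -> Tuple[str, int]:
    """Run both stages once: the duration depends on the phase chosen first."""
    phase = select_phase(queue_by_phase)
    duration = select_duration(queue_by_phase[phase])
    return phase, duration


if __name__ == "__main__":
    # Hypothetical queue lengths (vehicles) for two phases at one intersection.
    print(two_stage_step({"NS_through": 8, "EW_through": 3}))
```

The point of the sketch is the contrast with a single-stage controller, which would return only a phase and hold it for a fixed interval; here the second stage varies the duration with the demand on the selected phase.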