Traffic signal control is crucial for improving transportation efficiency. Recently, learning-based methods, especially Deep Reinforcement Learning (DRL), have achieved substantial success in the search for more efficient traffic signal control strategies. However, reward design in DRL requires considerable domain knowledge for training to converge to an effective policy, and the resulting policy is difficult to explain. In this work, we propose a new learning-based method for signal control at complex intersections. We introduce the concept of phase urgency for each signal phase: during signal transitions, the control strategy selects the next phase to activate based on the phase urgency. We then propose representing the urgency function, which computes the urgency of a given phase from the current road conditions, as an explainable tree structure. Genetic programming is adopted to perform gradient-free optimization of the urgency function. We test our algorithm on multiple public traffic signal control datasets. The experimental results indicate that the tree-shaped urgency function evolved by genetic programming outperforms the baselines, including a state-of-the-art method in the transportation field and a well-known DRL-based method.
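To make the idea of a tree-shaped urgency function concrete, the following minimal Python sketch evaluates an expression tree on per-phase road features and picks the next phase. It is an illustration under stated assumptions, not the method's actual implementation: the feature names (`queue`, `wait`), the operator set, the example tree, and the rule of activating the phase with the highest urgency are all hypothetical choices that the abstract does not specify. In the proposed approach the tree would be evolved by genetic programming rather than written by hand.

```python
import operator

# Hypothetical primitive set; the abstract does not list the actual operators.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, features):
    """Recursively evaluate an expression tree on one phase's features."""
    if isinstance(tree, str):            # terminal: a named road feature
        return features[tree]
    if isinstance(tree, (int, float)):   # terminal: a numeric constant
        return tree
    op, left, right = tree               # internal node: (operator, left, right)
    return OPS[op](evaluate(left, features), evaluate(right, features))


def select_phase(tree, phase_features):
    """Return the phase with the highest urgency (assumed selection rule)."""
    return max(phase_features, key=lambda p: evaluate(tree, phase_features[p]))


# Illustrative hand-written tree: urgency = queue + 0.5 * wait
urgency_tree = ("add", "queue", ("mul", 0.5, "wait"))

# Made-up road conditions for three phases of one intersection.
phases = {
    "north_south": {"queue": 4, "wait": 10},
    "east_west": {"queue": 7, "wait": 2},
    "left_turns": {"queue": 1, "wait": 30},
}

next_phase = select_phase(urgency_tree, phases)
print(next_phase)
```

Because the urgency function is a small expression tree, it can be read directly as a formula, which is the sense in which such a representation is explainable.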