In reinforcement learning-based (RL-based) traffic signal control (TSC), signal timing decisions are made from the available information about vehicles at a road intersection. This information forms the state representation of the RL environment, which can be either high-dimensional, containing several variables, or a low-dimensional vector. Current studies suggest that high-dimensional state representations do not improve TSC performance. However, we argue, with experimental results, that high-dimensional state representations can in fact improve TSC performance, reducing the average waiting time by up to 17.9%. Such a representation is obtainable using cost-effective vehicle-to-infrastructure (V2I) communication, which encourages its adoption for TSC. Additionally, given the large size of the state, we identify the need for computationally efficient models and explore model compression via pruning.
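To make the idea of model compression via pruning concrete, the following is a minimal sketch of unstructured magnitude pruning applied to a single weight matrix. The abstract does not say which pruning scheme is used, so the technique shown here, the function name `magnitude_prune`, the layer shape, and the 75% sparsity level are all illustrative assumptions rather than the authors' method.

```python
import random


def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of the entries.

    `weights` is a list of rows (a plain 2-D list). Returns a pruned copy
    and the number of entries that were set to zero. Hypothetical helper,
    not taken from the paper.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    # Rank every entry globally by absolute value, remembering its position.
    ranked = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    n_drop = int(sparsity * len(ranked))
    pruned = [list(row) for row in weights]  # leave the input untouched
    for _, i, j in ranked[:n_drop]:
        pruned[i][j] = 0.0
    return pruned, n_drop


random.seed(0)
# Assumed toy layer: a 32-dimensional state mapped to 8 hidden units.
layer = [[random.gauss(0.0, 1.0) for _ in range(32)] for _ in range(8)]
pruned_layer, n_dropped = magnitude_prune(layer, 0.75)
```

In this sketch, a larger state vector means a wider first layer, so the number of weights removed at a given sparsity grows with the state dimension. In practice the pruned network would usually be fine-tuned afterwards to recover any lost control performance.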