Reducing travel time alone is not enough to support the development of future smart transportation systems. To align with the United Nations Sustainable Development Goals (UN-SDG), such systems should also reduce fuel consumption and emissions, improve traffic safety, and be easy to deploy and maintain. Existing work optimizes either traffic light signal control (to improve intersection throughput) or vehicle speed control (to stabilize traffic). In contrast, this paper presents CoTV, a multi-agent Deep Reinforcement Learning (DRL) system that Cooperatively controls both Traffic light signals and Connected Autonomous Vehicles (CAVs). By controlling both, CoTV balances reductions in travel time, fuel consumption, and emissions. CoTV is also easy to deploy: each traffic light controller cooperates with only one CAV on each incoming road, namely the one nearest to the controller. This restriction makes coordination between traffic light controllers and CAVs more efficient, which in turn allows CoTV training to converge in large-scale multi-agent scenarios where convergence is traditionally difficult. We give the detailed system design of CoTV and demonstrate its effectiveness in a simulation study using SUMO on various grid maps and realistic urban scenarios with mixed-autonomy traffic.
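The rule of cooperating with only the nearest CAV on each incoming road can be illustrated with a minimal sketch. This is not the CoTV implementation: the `Vehicle` record, its field names, and the `select_cooperating_cavs` helper are hypothetical, and the sketch assumes that each vehicle's distance to the traffic light is already known (for example, from the simulator).

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Vehicle:
    """Hypothetical vehicle record; field names are illustrative only."""
    vehicle_id: str
    road_id: str               # incoming road the vehicle is currently on
    distance_to_light: float   # metres to the traffic light (assumed known)
    is_cav: bool               # False for human-driven vehicles


def select_cooperating_cavs(vehicles: Iterable[Vehicle]) -> Dict[str, str]:
    """Return, for each incoming road, the id of the CAV nearest the light.

    Human-driven vehicles are ignored, and roads without any CAV are
    simply absent from the result.
    """
    nearest: Dict[str, Vehicle] = {}
    for v in vehicles:
        if not v.is_cav:
            continue
        best = nearest.get(v.road_id)
        if best is None or v.distance_to_light < best.distance_to_light:
            nearest[v.road_id] = v
    return {road: v.vehicle_id for road, v in nearest.items()}
```

In mixed-autonomy traffic, the vehicle closest to the light may be human-driven; under the assumptions of this sketch it is skipped, and the nearest CAV behind it is selected instead. A road with no CAV yields no cooperating agent.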