Mixed Traffic Control and Coordination from Pixels

Traffic congestion is a persistent problem in our society. Existing traffic control methods have proven futile in alleviating current congestion levels, leading researchers to explore ideas with robot vehicles, given the increasing emergence of vehicles with different levels of autonomy on our roads. This gives rise to mixed traffic control, where robot vehicles regulate human-driven vehicles through reinforcement learning (RL). However, most existing studies use precise observations that involve global information, such as environment outflow, and local information, i.e., vehicle positions and velocities. Obtaining this information requires updating existing road infrastructure with vast sensor networks and communicating with potentially unwilling human drivers. We consider image observations as an alternative for mixed traffic control via RL: 1) images are ubiquitous through satellite imagery, in-car camera systems, and traffic monitoring systems; 2) images do not require a complete re-imagination of the observation space from environment to environment; and 3) images only require communication with equipment. In this work, we show that robot vehicles using image observations can achieve performance similar to that obtained with precise information on environments including ring, figure eight, intersection, merge, and bottleneck. In certain scenarios, our approach even outperforms precise observations, e.g., an increase of up to 26% in average vehicle velocity in the merge environment and a 6% increase in outflow in the bottleneck environment, despite using only local traffic information as opposed to global traffic information.
