Vehicles today can drive themselves on highways, and driverless robotaxis operate in major cities; more sophisticated levels of autonomous driving are expected to become available and more common in the future. Yet, technically speaking, so-called "Level 5" (L5) operation, corresponding to full autonomy, has not been achieved. For that to happen, functions such as fully autonomous highway ramp entry must be available and must provide provably safe and reliably robust behavior. We present a systematic study of a highway ramp function that controls the vehicle's forward-moving actions to minimize collisions with the stream of highway traffic into which a merging (ego) vehicle enters. We take a game-theoretic multi-agent (MA) approach to this problem and study the use of controllers based on deep reinforcement learning (DRL). The virtual environment of the MA DRL uses self-play with simulated data, in which merging vehicles safely learn to control longitudinal position during a taper-type merge. This paper extends existing work by studying the interaction of more than two vehicles (agents), and does so by systematically expanding the road scene with additional traffic and ego vehicles. While previous work on the two-vehicle setting established that collision-free controllers are theoretically impossible in fully decentralized, non-coordinated environments, we empirically show that controllers learned using our approach are nearly ideal when measured against idealized optimal controllers.