RLPG: Reinforcement Learning Approach for Dynamic Intra-Platoon Gap Adaptation for Highway On-Ramp Merging

A platoon is a group of vehicles traveling together in very close proximity using automated driving technology. Because it can improve fuel efficiency, driving safety, and driver comfort, platooning has attracted substantial attention from the autonomous vehicle research community. Despite these advantages, recent research has shown that an excessively small intra-platoon gap can impede traffic flow during highway on-ramp merging. Existing control-based methods can adapt the intra-platoon gap to improve traffic flow, but making an optimal control decision under complex, dynamic traffic conditions remains a challenge because of the high computational complexity involved. In this paper, we present the design, implementation, and evaluation of a novel reinforcement learning framework that adaptively adjusts the intra-platoon gap of an individual platoon member to maximize traffic flow in response to dynamically changing, complex traffic conditions during highway on-ramp merging. The framework's state space is designed with reference to the transportation literature so that it captures the critical traffic parameters directly relevant to merging efficiency. An intra-platoon gap decision-making method based on the deep deterministic policy gradient (DDPG) algorithm operates over a continuous action space, enabling precise and continuous adaptation of the intra-platoon gap. An extensive simulation study demonstrates that the reinforcement learning-based approach significantly improves traffic flow in a variety of highway on-ramp merging scenarios.
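
To illustrate what a continuous action space means for gap selection, the following minimal sketch shows how a deterministic policy of the kind DDPG uses might map a traffic state to a gap value. It is not the paper's implementation: the abstract does not specify the state features, the gap bounds, or the network, so the bounds `GAP_MIN_M` and `GAP_MAX_M`, the single-layer `actor`, and the Gaussian exploration noise are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical bounds on the intra-platoon gap in metres (not from the paper).
GAP_MIN_M = 5.0
GAP_MAX_M = 40.0


def scale_action(raw, low=GAP_MIN_M, high=GAP_MAX_M):
    """Map a tanh-squashed actor output in [-1, 1] to a gap in [low, high]."""
    return low + 0.5 * (raw + 1.0) * (high - low)


def actor(state, weights, bias):
    """Toy deterministic policy: one linear layer followed by tanh.

    A real DDPG actor would be a trained neural network; this stand-in
    only shows the shape of the state-to-action mapping.
    """
    z = sum(w * s for w, s in zip(weights, state)) + bias
    return math.tanh(z)


def select_gap(state, weights, bias, noise_std=0.0, rng=random):
    """Return a continuous gap; optional Gaussian noise models exploration."""
    raw = actor(state, weights, bias)
    if noise_std > 0.0:
        # Clip after adding noise so the scaled gap stays within bounds.
        raw = max(-1.0, min(1.0, raw + rng.gauss(0.0, noise_std)))
    return scale_action(raw)


if __name__ == "__main__":
    # Hypothetical state: three normalised traffic features.
    example_state = [0.2, -0.1, 0.4]
    example_weights = [0.5, 0.3, -0.2]
    print(select_gap(example_state, example_weights, bias=0.0))
```

Because the output is a real number within the chosen bounds rather than one of a few discrete levels, the policy can request any intermediate gap, which is the property the abstract attributes to the DDPG-based method.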

