Cooperative Adaptive Cruise Control (CACC) is a representative control strategy for coordinating vehicle platoons in Connected and Automated Vehicle (CAV) systems, improving traffic efficiency and reducing energy consumption. In recent years, data-driven methods such as reinforcement learning (RL) have been applied to this task because of their efficiency and flexibility. However, current RL-based approaches rarely account for the delays that often arise in real-world CACC systems. To address this problem, we propose a Delay-Aware Multi-Agent Reinforcement Learning (DAMARL) framework for safe and stable CACC control. We model the decision-making process as a Multi-Agent Delay-Aware Markov Decision Process (MADA-MDP) and develop a centralized training with decentralized execution (CTDE) MARL framework for distributed control of CACC platoons. A policy network with an integrated attention mechanism is introduced to improve CAV communication and decision-making. In addition, an action filter based on a velocity optimization model is incorporated to further ensure platoon stability. Experimental results across various delay conditions and platoon sizes show that our approach consistently outperforms baseline methods in platoon safety, stability, and overall performance.
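One common way to make a decision process delay-aware is to augment each agent's observation with the actions it has already issued but that have not yet taken effect. The abstract does not give the state definition of the MADA-MDP, so the sketch below is only an illustrative assumption of that general idea, not the authors' formulation; `DelayAwareBuffer`, `delay_steps`, and `default_action` are hypothetical names, and a fixed, known delay is assumed.

```python
from collections import deque


class DelayAwareBuffer:
    """Tracks actions that are issued but not yet executed (assumed fixed delay)."""

    def __init__(self, delay_steps, default_action=0.0):
        # delay_steps: number of control steps between issuing and executing an action.
        self.delay_steps = delay_steps
        # Before any action is issued, the pending slots hold a neutral default.
        self.pending = deque([default_action] * delay_steps)

    def augmented_state(self, observation):
        # The policy sees the raw observation plus every pending action,
        # oldest first, so the delayed process becomes Markov again.
        return tuple(observation) + tuple(self.pending)

    def push(self, new_action):
        """Queue new_action and return the action that takes effect now."""
        if self.delay_steps == 0:
            return new_action  # no delay: the action applies immediately
        applied = self.pending.popleft()
        self.pending.append(new_action)
        return applied


# Usage: one vehicle with a two-step actuation delay.
buf = DelayAwareBuffer(delay_steps=2)
state = buf.augmented_state((1.0, 2.0))   # observation + two pending actions
executed = buf.push(0.5)                  # 0.5 is queued; a default action runs now
```

In a multi-agent setting, each vehicle would keep its own buffer and feed the augmented state to its decentralized policy; how the paper combines this with the attention mechanism and the action filter is not stated in the abstract.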