Attention mechanisms excel at learning sequential patterns by weighting data according to relevance and importance. This capability underlies the state-of-the-art performance of advanced generative artificial intelligence models. This paper applies the attention mechanism to multi-agent safe control. Specifically, we consider the design of a neural network that controls autonomous vehicles in a highway merging scenario. The environment is modeled as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). Within a QMIX framework, we include partial attention for each autonomous vehicle, allowing each ego vehicle to focus on its most relevant neighboring vehicles. Moreover, we propose a comprehensive reward signal that accounts for both the global objectives of the environment (e.g., safety and vehicle flow) and the individual interests of each agent. Simulations are conducted in the Simulation of Urban MObility (SUMO) simulator. The results show that the proposed approach outperforms other driving algorithms in terms of safety, driving speed, and reward.
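To make the idea of an ego vehicle attending to its neighbors concrete, the following is a minimal sketch of scaled dot-product attention over a handful of observed vehicles. It is an illustration under stated assumptions, not the architecture used in this paper: the abstract does not specify the feature layout, the projections, or how attention is wired into QMIX. The function names, the two-dimensional feature vectors, and the use of raw features in place of learned query/key/value projections are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def attend_to_neighbors(ego_query, neighbor_keys, neighbor_values):
    """Scaled dot-product attention of one ego vehicle over its neighbors.

    Returns (weights, context): one weight per neighbor (summing to 1) and
    the weighted average of the neighbor value vectors.
    """
    scale = math.sqrt(len(ego_query))
    scores = [dot(ego_query, key) / scale for key in neighbor_keys]
    weights = softmax(scores)
    value_dim = len(neighbor_values[0])
    context = [
        sum(w * value[j] for w, value in zip(weights, neighbor_values))
        for j in range(value_dim)
    ]
    return weights, context


# Hypothetical example: in a trained network these vectors would come from
# learned projections of each vehicle's observation; here they are made up.
ego_query = [1.0, 0.0]
neighbor_keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
neighbor_values = [[30.0, 5.0], [25.0, 40.0], [20.0, 80.0]]

weights, context = attend_to_neighbors(ego_query, neighbor_keys, neighbor_values)
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

In this toy setup the first neighbor's key is most aligned with the ego query, so it receives the largest weight and dominates the context vector. In a QMIX-style setting, a context vector of this kind could plausibly feed each agent's individual value network, though that wiring is an assumption here.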