Multi-agent navigation has been studied extensively. Recent learning-based techniques have brought remarkable progress, yielding substantially more intelligent and realistic behavior. Nonetheless, prevailing learned models are limited in scalability and effectiveness, primarily because they are agent-centric, i.e., the learned neural policy is deployed individually on each agent. Inspired by the efficiency of real-world traffic networks, we present an environment-centric navigation policy. Our method learns a set of traffic rules that coordinate a large group of unintelligent agents possessing only basic collision-avoidance capabilities. It segments the environment into distinct blocks and parameterizes the traffic rules with a Graph Recurrent Neural Network (GRNN) over the block network. Each GRNN node is trained to modulate the velocities of agents as they traverse its block. Using either Imitation Learning (IL) or Reinforcement Learning (RL), we demonstrate that our neural traffic rules resolve agent congestion effectively and closely resemble real-world traffic regulations. Our method handles up to $240$ agents in real time and generalizes across diverse agent and environment configurations.
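To make the environment-centric idea concrete, the following is a minimal sketch of one graph-recurrent update over a grid of blocks, where each block emits a velocity scale for the agents currently inside it. It is not the authors' architecture: the 4-neighbour grid, the agent-count observation, the tanh/sigmoid layers, and all names (`grid_adjacency`, `grnn_step`, `W_self`, `W_nbr`, `W_in`, `w_out`) are assumptions made for illustration, and the weights are random rather than trained with IL or RL.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """Row-normalized 4-neighbour adjacency of a rows x cols block grid (assumed layout)."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    A[i, rr * cols + cc] = 1.0
    # Guard against isolated blocks (e.g. a 1x1 grid) before normalizing.
    return A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)

def grnn_step(A, h, x, W_self, W_nbr, W_in, w_out):
    """One recurrent step: each block mixes its own hidden state, the mean of its
    neighbours' hidden states, and its local observation, then emits a scale in (0, 1)."""
    h_new = np.tanh(h @ W_self + (A @ h) @ W_nbr + x @ W_in)
    scale = 1.0 / (1.0 + np.exp(-(h_new @ w_out)))  # sigmoid, one value per block
    return h_new, scale

# Hypothetical setup: a 3x4 block grid with 8 hidden units per block.
rng = np.random.default_rng(0)
rows, cols, d = 3, 4, 8
A = grid_adjacency(rows, cols)
h = np.zeros((rows * cols, d))

# Assumed observation: number of agents currently in each block.
counts = rng.integers(0, 5, size=(rows * cols, 1)).astype(float)

# Untrained random weights; in the paper these would be learned.
W_self, W_nbr = rng.normal(scale=0.1, size=(2, d, d))
W_in = rng.normal(scale=0.1, size=(1, d))
w_out = rng.normal(scale=0.1, size=d)

h, scale = grnn_step(A, h, counts, W_self, W_nbr, W_in, w_out)

# Each agent's preferred velocity is scaled by the output of the block it occupies.
agent_block = np.array([0, 5, 5, 11])
preferred_vel = np.ones((4, 2))
modulated_vel = preferred_vel * scale[agent_block][:, None]
```

In this sketch the agents carry no learned policy. All coordination comes from the per-block scales, which is the sense in which the policy is environment-centric. The agents' basic collision avoidance would run separately on the modulated velocities.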