We consider the problem of multi-agent navigation and collision avoidance when observations are limited to the local neighborhood of each agent. We propose InforMARL, a novel architecture for multi-agent reinforcement learning (MARL) that uses local information intelligently to compute paths for all the agents in a decentralized manner. Specifically, InforMARL uses a graph neural network to aggregate information about the local neighborhood of agents for both the actor and the critic, and it can be used in conjunction with any standard MARL algorithm. We show that (1) in training, InforMARL has better sample efficiency and performance than baseline approaches, despite using less information, and (2) in testing, it scales well to environments with arbitrary numbers of agents and obstacles. We illustrate these results using four task environments, including one with predetermined goals for each agent and one in which the agents collectively try to cover all goals.
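To make the idea of aggregating local-neighborhood information concrete, the following is a minimal sketch and not the InforMARL architecture itself. It assumes, purely for illustration, that each entity has a 2D position and a small feature vector, that two entities are neighbors when they lie within a hypothetical `sensing_radius`, and that aggregation is a single round of unweighted averaging. A learned graph neural network would replace the averaging with trained message and update functions. The names `local_neighbors` and `aggregate` are invented for this example.

```python
import math


def local_neighbors(positions, sensing_radius):
    """For each entity, list the indices of the other entities within sensing_radius."""
    neighbors = []
    for i, p in enumerate(positions):
        near = [
            j
            for j, q in enumerate(positions)
            if j != i and math.dist(p, q) <= sensing_radius
        ]
        neighbors.append(near)
    return neighbors


def aggregate(features, neighbors):
    """One round of mean aggregation over each entity and its local neighbors.

    Assumption: plain averaging stands in for a learned GNN layer.
    """
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        dim = len(features[i])
        out.append([sum(v[k] for v in group) / len(group) for k in range(dim)])
    return out


# Hypothetical toy scene: two nearby entities and one far away.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
features = [[1.0, 0.0], [3.0, 2.0], [10.0, 10.0]]

neighbors = local_neighbors(positions, sensing_radius=2.0)
embeddings = aggregate(features, neighbors)
```

Because each embedding depends only on the entities inside the sensing radius, the same two functions apply unchanged to a scene with any number of entities, which is one intuition for why a neighborhood-based representation can transfer to environments of different sizes.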