This paper explores the impact of relational state abstraction on sample efficiency and performance in collaborative Multi-Agent Reinforcement Learning (MARL). The proposed abstraction is based on spatial relationships in environments where direct communication between agents is not allowed, leveraging the ubiquity of spatial reasoning in real-world multi-agent scenarios. We introduce MARC (Multi-Agent Relational Critic), a simple yet effective critic architecture that incorporates spatial relational inductive biases by transforming the state into a spatial graph and processing it through a relational graph neural network. We evaluate MARC on six collaborative tasks, including a novel environment with heterogeneous agents. A comprehensive empirical comparison against state-of-the-art MARL baselines shows improvements in both sample efficiency and asymptotic performance, and indicates potential for generalization. Our findings suggest that a minimal integration of spatial relational inductive biases as abstraction can yield substantial benefits without requiring complex designs or task-specific engineering. This work provides insights into the potential of relational state abstraction to address sample efficiency, a key challenge in MARL, and offers a promising direction for developing more efficient algorithms in spatially complex environments.
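To make the idea of a spatial graph processed by a relational graph neural network more concrete, the following is a minimal illustrative sketch, not the MARC implementation. Everything specific in it is an assumption, since the abstract does not give these details: the relation vocabulary (`left_of`, `right_of`, `above`, `below`), the 2-D position encoding with y increasing upward, the single R-GCN-style layer with per-relation mean aggregation, the mean-pooling readout, and all function names.

```python
import numpy as np

# Hypothetical relation vocabulary; the relation set actually used by MARC
# is not specified in the abstract.
RELATIONS = ("left_of", "right_of", "above", "below")


def build_spatial_graph(positions):
    """Build a (num_relations, n, n) adjacency tensor from (n, 2) positions.

    adj[r, i, j] == 1 means entity j stands in relation r to entity i.
    Assumes the y coordinate increases upward.
    """
    positions = np.asarray(positions, dtype=float)
    n = positions.shape[0]
    adj = np.zeros((len(RELATIONS), n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx, dy = positions[j] - positions[i]
            if dx < 0:
                adj[0, i, j] = 1.0  # j is left of i
            if dx > 0:
                adj[1, i, j] = 1.0  # j is right of i
            if dy > 0:
                adj[2, i, j] = 1.0  # j is above i
            if dy < 0:
                adj[3, i, j] = 1.0  # j is below i
    return adj


def relational_gnn_layer(features, adj, rel_weights, self_weight):
    """One R-GCN-style layer: self term plus per-relation mean aggregation, then ReLU."""
    out = features @ self_weight
    for r in range(adj.shape[0]):
        degree = adj[r].sum(axis=1, keepdims=True)
        normalized = adj[r] / np.maximum(degree, 1.0)  # avoid division by zero
        out = out + normalized @ features @ rel_weights[r]
    return np.maximum(out, 0.0)


def critic_value(positions, features, rel_weights, self_weight, readout):
    """Map a state (entity positions and features) to a scalar value estimate."""
    adj = build_spatial_graph(positions)
    hidden = relational_gnn_layer(features, adj, rel_weights, self_weight)
    return float(hidden.mean(axis=0) @ readout)  # mean pooling over entities
```

Because the graph is built only from relative positions and the readout is a mean over entities, the value estimate in this sketch does not depend on the order in which entities are listed. This is one way a relational inductive bias can reduce what a critic has to learn from data.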