The Transformer model has demonstrated success across a wide range of domains, including Multi-Agent Reinforcement Learning (MARL), where the Multi-Agent Transformer (MAT) has emerged as a leading algorithm. However, a significant drawback of Transformer models is their quadratic computational complexity in input size, which makes them expensive to scale to larger inputs. This limitation restricts MAT's scalability in environments with many agents. Recently, State-Space Models (SSMs) have gained attention due to their computational efficiency, but their application in MARL remains unexplored. In this work, we investigate the use of Mamba, a recent SSM, in MARL and assess whether it can match the performance of MAT while providing significant improvements in efficiency. We introduce a modified version of MAT that incorporates standard and bi-directional Mamba blocks, as well as a novel "cross-attention" Mamba block. Extensive testing shows that our Multi-Agent Mamba (MAM) matches the performance of MAT across multiple standard multi-agent environments, while offering superior scalability to scenarios with many more agents. This is significant for the MARL community because it indicates that SSMs could replace Transformers without compromising performance, whilst also supporting more effective scaling to higher numbers of agents. Our project page is available at https://sites.google.com/view/multi-agent-mamba .
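The complexity contrast that motivates this work can be sketched in a few lines. This is a toy illustration only, not the MAM or Mamba implementation: self-attention materialises an n-by-n score matrix over the n agent tokens, whereas a linear state-space recurrence performs one fixed-size state update per token. The function names and the scalar decay `a` are illustrative assumptions.

```python
import numpy as np

def attention_scores(x):
    """Simplified self-attention (queries = keys = x): builds an
    (n, n) score matrix, so compute and memory grow quadratically
    with the number of agent tokens n."""
    return x @ x.T

def ssm_scan(x, a=0.9):
    """Toy linear state-space recurrence h_t = a * h_{t-1} + x_t,
    y_t = h_t: one O(d) state update per token, so total cost grows
    linearly with the number of agent tokens n."""
    h = np.zeros(x.shape[1])
    ys = []
    for xt in x:              # one update per agent token
        h = a * h + xt
        ys.append(h.copy())
    return np.stack(ys)

n, d = 8, 4                   # n agents, feature dimension d (illustrative)
x = np.random.default_rng(0).normal(size=(n, d))
scores = attention_scores(x)  # shape (n, n): quadratic footprint
out = ssm_scan(x)             # shape (n, d): linear-time scan
```

In a real Mamba block the recurrence parameters are input-dependent and the scan is parallelised, but the asymptotic picture is the same: the attention path scales as O(n²) in the number of agents while the SSM path scales as O(n).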