The Transformer model has demonstrated success across a wide range of domains, including Multi-Agent Reinforcement Learning (MARL), where the Multi-Agent Transformer (MAT) has emerged as a leading algorithm. However, a significant drawback of Transformer models is their quadratic computational complexity in input size, which makes them expensive to scale to larger inputs. This limitation restricts MAT's scalability in environments with many agents. Recently, State-Space Models (SSMs) have gained attention due to their computational efficiency, but their application in MARL remains unexplored. In this work, we investigate the use of Mamba, a recent SSM, in MARL and assess whether it can match the performance of MAT while providing significant improvements in efficiency. We introduce a modified version of MAT that incorporates standard and bi-directional Mamba blocks, as well as a novel "cross-attention" Mamba block. Extensive testing shows that our Multi-Agent Mamba (MAM) matches the performance of MAT across multiple standard multi-agent environments, while offering superior scalability to scenarios with many more agents. This is significant for the MARL community because it indicates that SSMs could replace Transformers without compromising performance, whilst also supporting more effective scaling to higher numbers of agents. Our project page is available at https://sites.google.com/view/multi-agent-mamba .
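The complexity contrast that motivates this work can be sketched in a few lines. This is a toy illustration only, not the MAM or Mamba implementation: self-attention materialises an n-by-n score matrix over the n agent tokens, whereas a linear state-space recurrence performs one fixed-size state update per token. The function names and the scalar decay `a` are illustrative assumptions.

```python
import numpy as np

def attention_scores(x):
    """Simplified self-attention (queries = keys = x): builds an
    (n, n) score matrix, so compute and memory grow quadratically
    with the number of agent tokens n."""
    return x @ x.T

def ssm_scan(x, a=0.9):
    """Toy linear state-space recurrence h_t = a * h_{t-1} + x_t,
    y_t = h_t: one O(d) state update per token, so total cost grows
    linearly with the number of agent tokens n."""
    h = np.zeros(x.shape[1])
    ys = []
    for xt in x:              # one update per agent token
        h = a * h + xt
        ys.append(h.copy())
    return np.stack(ys)

n, d = 8, 4                   # n agents, feature dimension d (illustrative)
x = np.random.default_rng(0).normal(size=(n, d))
scores = attention_scores(x)  # shape (n, n): quadratic footprint
out = ssm_scan(x)             # shape (n, d): linear-time scan
```

In a real Mamba block the recurrence parameters are input-dependent and the scan is parallelised, but the asymptotic picture is the same: the attention path scales as O(n²) in the number of agents while the SSM path scales as O(n).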