Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning - 专知论文

会员服务 ·

0

Learning · 表示 · 强化学习 · 特化 · Integration ·

Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning

翻译：暂无翻译

Zicheng Zhao,Yu Lan,Chengzhengxu Li,Zhaohan Zhang,Xiaoming Liu

from arxiv, Under Review

Cooperative Multi-Agent Reinforcement Learning (MARL) frequently suffers from severe reward sparsity and exploration bottlenecks. While episodic memory mechanisms mitigate these issues by reusing high-return trajectories, they often trap agents in local optima due to unconstrained incentive distribution and semantic representation collapse. To address this, we propose Episodic Memory Temporal Consistency (EMTC), a framework that robustly constructs and selectively leverages historical experiences. EMTC introduces two synergistic components: (1) a Temporally Consistent Semantic Embedder that integrates contrastive learning with time-conditioned state reconstruction, preventing representation collapse and enabling precise memory retrieval; and (2) a Temporal Consistency Gating Mechanism that dynamically modulates episodic incentives based on temporal consistency error. This adaptive gate filters misleading signals from pseudo-successful trajectories, effectively mitigating Q-value overestimation. We provide theoretical guarantees, establishing a strict error bound that directly links the observable temporal consistency error to the underlying trajectory optimality and representation quality. Extensive evaluations on the SMAC and GRF benchmarks demonstrate that EMTC consistently outperforms state-of-the-art baselines. Notably, compared to the strongest episodic baseline, EMTC achieves absolute win-rate improvements of up to 24% in super-hard SMAC scenarios and an average improvement of 28% across GRF tasks.

翻译：暂无翻译

0

相关内容

Learning

多智能体强化学习中的稳健且高效的通信

多智能体强化学习中的稳健且高效的通信

专知会员服务

25+阅读 · 2025年11月17日

开放环境下的协作多智能体强化学习进展综述

开放环境下的协作多智能体强化学习进展综述

专知会员服务

34+阅读 · 2025年1月19日

【NeurIPS 2024】通过等变性提升多智能体强化学习中的样本效率和泛化能力

【NeurIPS 2024】通过等变性提升多智能体强化学习中的样本效率和泛化能力

专知会员服务

19+阅读 · 2024年10月6日

《注意力驱动的多智能体强化学习：利用专业知识强化任务决策》

《注意力驱动的多智能体强化学习：利用专业知识强化任务决策》

专知会员服务

57+阅读 · 2024年8月3日

《开放环境下协作多智能体强化学习研究进展综述》南大最新62页长综述

《开放环境下协作多智能体强化学习研究进展综述》南大最新62页长综述

专知会员服务

64+阅读 · 2024年2月2日

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

专知会员服务

140+阅读 · 2022年11月26日

「博弈论视角下多智能体强化学习」研究综述

「博弈论视角下多智能体强化学习」研究综述

专知会员服务

184+阅读 · 2022年4月30日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

专知会员服务

26+阅读 · 2019年10月3日

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

专知

12+阅读 · 2022年11月26日

「博弈论视角下多智能体强化学习」研究综述

「博弈论视角下多智能体强化学习」研究综述

专知

60+阅读 · 2022年4月30日

【综述】多智能体强化学习算法理论研究

【综述】多智能体强化学习算法理论研究

深度强化学习实验室

16+阅读 · 2020年9月9日

多智能体强化学习（MARL）近年研究概览

多智能体强化学习（MARL）近年研究概览

PaperWeekly

38+阅读 · 2020年3月15日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

李宏毅-201806-中文-Deep Reinforcement Learning精品课程分享

李宏毅-201806-中文-Deep Reinforcement Learning精品课程分享

深度学习与NLP

15+阅读 · 2018年6月20日

循环神经网络多模态深度模型联想记忆功能研究

国家自然科学基金

6+阅读 · 2017年12月31日

忆阻递归神经网络的多重稳定性理论研究

国家自然科学基金

0+阅读 · 2015年12月31日

一对多联想记忆中的细胞神经网络建模及参数获取方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多样化查询的多标记主动学习研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于记忆的不变图像特征学习方法研究

国家自然科学基金

3+阅读 · 2015年12月31日

学习与记忆的神经动力学研究

国家自然科学基金

1+阅读 · 2014年12月31日

面向人与Agent混合的多团队协作仿真训练方法研究

国家自然科学基金

19+阅读 · 2012年12月31日

不确定环境下强化学习和决策的神经机制

国家自然科学基金

11+阅读 · 2012年12月31日

基于群体智能的多Agent协作模型与适应性研究

国家自然科学基金

18+阅读 · 2009年12月31日

基于动态分层与自学习的多智能体自适应协作模型

国家自然科学基金

17+阅读 · 2008年12月31日

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 6月12日

Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 6月10日

Merging model-based control with multi-agent reinforcement learning for multi-agent cooperative teaming strategies

Arxiv

0+阅读 · 6月4日

Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月29日

Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月29日

An Agent-Centric Dynamical Systems Perspective on Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月28日

Offline Multi-agent Reinforcement Learning via Sequential Score Decomposition

Arxiv

0+阅读 · 5月28日

LLM-ALSO: LLM-Driven Adaptive Learning-Signal Optimization for Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月28日

Scaling up Energy-Aware Multi-Agent Reinforcement Learning for Mission-Oriented Drone Networks with Individual Reward

Arxiv

0+阅读 · 5月24日

Coding for Distributed Multi-Agent Reinforcement Learning

Arxiv

32+阅读 · 2021年1月7日

VIP会员

文章信息

相关主题

最新内容

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

专知会员服务

3+阅读 · 今天8:18

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

专知会员服务

3+阅读 · 今天7:39

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

8+阅读 · 今天7:33

《通过小型无人机系统将情报能力“作战化”》

《通过小型无人机系统将情报能力“作战化”》

专知会员服务

3+阅读 · 今天7:28

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

专知会员服务

5+阅读 · 今天7:14

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

专知会员服务

18+阅读 · 6月15日

消耗优势：美军的“精确规模化”概念

消耗优势：美军的“精确规模化”概念

专知会员服务

7+阅读 · 6月15日

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

专知会员服务

8+阅读 · 6月15日

《网络空间兵棋推演：挑战、局限性与混合路径》报告

《网络空间兵棋推演：挑战、局限性与混合路径》报告

专知会员服务

8+阅读 · 6月15日

《离线语言支持系统：面向空战战术决策》

《离线语言支持系统：面向空战战术决策》

专知会员服务

8+阅读 · 6月15日

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

专知会员服务

6+阅读 · 6月15日

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

专知会员服务

6+阅读 · 6月14日

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

专知会员服务

6+阅读 · 6月14日

俄乌战场地面机器人如何改写战争规则

俄乌战场地面机器人如何改写战争规则

专知会员服务

9+阅读 · 6月14日

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

专知会员服务

13+阅读 · 6月14日

相关VIP内容

多智能体强化学习中的稳健且高效的通信

多智能体强化学习中的稳健且高效的通信

专知会员服务

25+阅读 · 2025年11月17日

开放环境下的协作多智能体强化学习进展综述

开放环境下的协作多智能体强化学习进展综述

专知会员服务

34+阅读 · 2025年1月19日

【NeurIPS 2024】通过等变性提升多智能体强化学习中的样本效率和泛化能力

【NeurIPS 2024】通过等变性提升多智能体强化学习中的样本效率和泛化能力

专知会员服务

19+阅读 · 2024年10月6日

《注意力驱动的多智能体强化学习：利用专业知识强化任务决策》

《注意力驱动的多智能体强化学习：利用专业知识强化任务决策》

专知会员服务

57+阅读 · 2024年8月3日

《开放环境下协作多智能体强化学习研究进展综述》南大最新62页长综述

《开放环境下协作多智能体强化学习研究进展综述》南大最新62页长综述

专知会员服务

64+阅读 · 2024年2月2日

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

专知会员服务

140+阅读 · 2022年11月26日

「博弈论视角下多智能体强化学习」研究综述

「博弈论视角下多智能体强化学习」研究综述

专知会员服务

184+阅读 · 2022年4月30日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

专知会员服务

26+阅读 · 2019年10月3日

热门VIP内容

开通专知VIP会员享更多权益服务

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《通过小型无人机系统将情报能力“作战化”》

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

《通用大语言模型：无人机指挥与控制接口》最新40页

相关资讯

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

博弈论视角下的多智能体强化学习综述,129页pdf与76页Slides

专知

12+阅读 · 2022年11月26日

「博弈论视角下多智能体强化学习」研究综述

「博弈论视角下多智能体强化学习」研究综述

专知

60+阅读 · 2022年4月30日

【综述】多智能体强化学习算法理论研究

【综述】多智能体强化学习算法理论研究

深度强化学习实验室

16+阅读 · 2020年9月9日

多智能体强化学习（MARL）近年研究概览

多智能体强化学习（MARL）近年研究概览

PaperWeekly

38+阅读 · 2020年3月15日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

李宏毅-201806-中文-Deep Reinforcement Learning精品课程分享

李宏毅-201806-中文-Deep Reinforcement Learning精品课程分享

深度学习与NLP

15+阅读 · 2018年6月20日

相关论文

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 6月12日

Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 6月10日

Merging model-based control with multi-agent reinforcement learning for multi-agent cooperative teaming strategies

Arxiv

0+阅读 · 6月4日

Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月29日

Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月29日

An Agent-Centric Dynamical Systems Perspective on Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月28日

Offline Multi-agent Reinforcement Learning via Sequential Score Decomposition

Arxiv

0+阅读 · 5月28日

LLM-ALSO: LLM-Driven Adaptive Learning-Signal Optimization for Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 5月28日

Scaling up Energy-Aware Multi-Agent Reinforcement Learning for Mission-Oriented Drone Networks with Individual Reward

Arxiv

0+阅读 · 5月24日

Coding for Distributed Multi-Agent Reinforcement Learning

Arxiv

32+阅读 · 2021年1月7日

相关基金

循环神经网络多模态深度模型联想记忆功能研究

国家自然科学基金

6+阅读 · 2017年12月31日

忆阻递归神经网络的多重稳定性理论研究

国家自然科学基金

0+阅读 · 2015年12月31日

一对多联想记忆中的细胞神经网络建模及参数获取方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多样化查询的多标记主动学习研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于记忆的不变图像特征学习方法研究

国家自然科学基金

3+阅读 · 2015年12月31日

学习与记忆的神经动力学研究

国家自然科学基金

1+阅读 · 2014年12月31日

面向人与Agent混合的多团队协作仿真训练方法研究

国家自然科学基金

19+阅读 · 2012年12月31日

不确定环境下强化学习和决策的神经机制

国家自然科学基金

11+阅读 · 2012年12月31日

基于群体智能的多Agent协作模型与适应性研究

国家自然科学基金

18+阅读 · 2009年12月31日

基于动态分层与自学习的多智能体自适应协作模型

国家自然科学基金

17+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员