Topics: knowledge · reinforcement learning · performance improvement · catastrophic forgetting · agents

Continual Knowledge Adaptation for Reinforcement Learning


Jinwu Hu, Zihao Lian, Zhiquan Wen, Chenghao Li, Guohao Chen, Xutao Wen, Bin Xiao, Mingkui Tan

From arXiv; NeurIPS 2025

Reinforcement Learning enables agents to learn optimal behaviors through interactions with environments. However, real-world environments are typically non-stationary, requiring agents to continuously adapt to new tasks and changing conditions. Although Continual Reinforcement Learning facilitates learning across multiple tasks, existing methods often suffer from catastrophic forgetting and inefficient knowledge utilization. To address these challenges, we propose Continual Knowledge Adaptation for Reinforcement Learning (CKA-RL), which enables the accumulation and effective utilization of historical knowledge. Specifically, we introduce a Continual Knowledge Adaptation strategy, which involves maintaining a task-specific knowledge vector pool and dynamically using historical knowledge to adapt the agent to new tasks. This process mitigates catastrophic forgetting and enables efficient knowledge transfer across tasks by preserving and adapting critical model parameters. Additionally, we propose an Adaptive Knowledge Merging mechanism that combines similar knowledge vectors to address scalability challenges, reducing memory requirements while ensuring the retention of essential knowledge. Experiments on three benchmarks demonstrate that the proposed CKA-RL outperforms state-of-the-art methods, achieving an improvement of 4.20% in overall performance and 8.02% in forward transfer. The source code is available at https://github.com/Fhujinwu/CKA-RL.
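The abstract describes two mechanisms without giving their details: a pool of task-specific knowledge vectors that is reused when adapting to a new task, and a merging step that combines similar vectors to bound memory. The sketch below is one possible reading of that description, not the authors' implementation. Everything in it is an assumption made for illustration: the `KnowledgePool` class, treating a knowledge vector as an offset from shared base parameters, softmax weighting of historical vectors, cosine similarity as the similarity measure, and averaging as the merge rule. The linked repository is the authoritative source for the actual method.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class KnowledgePool:
    """Hypothetical pool of per-task knowledge vectors (parameter offsets)."""

    def __init__(self, base: Vector, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.base = list(base)
        self.capacity = capacity
        self.vectors: List[Vector] = []

    def compose(self, logits: List[float], new_vector: Vector) -> Vector:
        """Assumed rule: base + softmax-weighted history + new task vector.

        `logits` needs one entry per stored vector.
        """
        params = [b + n for b, n in zip(self.base, new_vector)]
        if self.vectors:
            weights = softmax(logits)
            for w, vec in zip(weights, self.vectors):
                params = [p + w * v for p, v in zip(params, vec)]
        return params

    def add(self, vector: Vector) -> None:
        """Store a vector, then merge similar pairs while over capacity."""
        self.vectors.append(list(vector))
        while len(self.vectors) > self.capacity:
            self._merge_most_similar()

    def _merge_most_similar(self) -> None:
        """Assumed merge rule: average the pair with highest cosine similarity."""
        best, pair = -2.0, (0, 1)
        for i in range(len(self.vectors)):
            for j in range(i + 1, len(self.vectors)):
                sim = cosine(self.vectors[i], self.vectors[j])
                if sim > best:
                    best, pair = sim, (i, j)
        i, j = pair
        merged = [(x + y) / 2 for x, y in zip(self.vectors[i], self.vectors[j])]
        del self.vectors[j]  # j > i, so index i stays valid
        self.vectors[i] = merged


# Usage: three task vectors in a pool that holds at most two.
pool = KnowledgePool(base=[0.0, 0.0], capacity=2)
for task_vector in ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]):
    pool.add(task_vector)
new_params = pool.compose(logits=[0.0, 0.0], new_vector=[0.0, 0.0])
```

In this toy run the first two vectors point in nearly the same direction, so they are the pair that gets averaged once the pool exceeds its capacity, while the dissimilar third vector is kept as is. In a real agent the weighting logits and the new task vector would be learned during training on the new task; here they are fixed by hand only to keep the example self-contained.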

