Reinforcement learning enables agents to learn optimal behaviors through interaction with their environments. However, real-world environments are typically non-stationary, requiring agents to continuously adapt to new tasks and changing conditions. Although Continual Reinforcement Learning facilitates learning across multiple tasks, existing methods often suffer from catastrophic forgetting and inefficient knowledge utilization. To address these challenges, we propose Continual Knowledge Adaptation for Reinforcement Learning (CKA-RL), which enables the accumulation and effective utilization of historical knowledge. Specifically, we introduce a Continual Knowledge Adaptation strategy that maintains a pool of task-specific knowledge vectors and dynamically leverages historical knowledge to adapt the agent to new tasks. By preserving and adapting critical model parameters, this process mitigates catastrophic forgetting and enables efficient knowledge transfer across tasks. Additionally, we propose an Adaptive Knowledge Merging mechanism that combines similar knowledge vectors to address scalability challenges, reducing memory requirements while retaining essential knowledge. Experiments on three benchmarks demonstrate that CKA-RL outperforms state-of-the-art methods, achieving improvements of 4.20% in overall performance and 8.02% in forward transfer. The source code is available at https://github.com/Fhujinwu/CKA-RL.
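The pool-and-merge idea described above can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: the class name, the softmax-over-cosine-similarity weighting used to combine historical vectors, and the pairwise-averaging merge rule with a fixed similarity threshold are all illustrative assumptions.

```python
import numpy as np

class KnowledgeVectorPool:
    """Toy pool of task-specific knowledge vectors (parameter deltas).

    Hypothetical sketch of the abstract's two mechanisms: adapting to a new
    task via a weighted combination of historical vectors, and merging
    similar vectors to bound memory. Details are assumptions, not CKA-RL's
    exact formulation.
    """

    def __init__(self, merge_threshold=0.9):
        self.vectors = []                  # one vector per completed task
        self.merge_threshold = merge_threshold

    def add(self, delta):
        """Store a new task's knowledge vector, then merge similar ones."""
        self.vectors.append(np.asarray(delta, dtype=float))
        self._merge_similar()              # Adaptive Knowledge Merging (sketch)

    def adapt(self, base_params, query):
        """Initialize a new task: base parameters plus a
        similarity-weighted sum of historical knowledge vectors."""
        base_params = np.asarray(base_params, dtype=float)
        if not self.vectors:
            return base_params
        sims = np.array([self._cos(query, v) for v in self.vectors])
        weights = np.exp(sims) / np.exp(sims).sum()   # softmax weighting
        return base_params + sum(w * v for w, v in zip(weights, self.vectors))

    def _merge_similar(self):
        """Average vectors whose cosine similarity exceeds the threshold,
        so the pool size stays bounded as tasks accumulate."""
        merged = []
        for v in self.vectors:
            for i, m in enumerate(merged):
                if self._cos(v, m) > self.merge_threshold:
                    merged[i] = (m + v) / 2.0         # keep one averaged vector
                    break
            else:
                merged.append(v)
        self.vectors = merged

    @staticmethod
    def _cos(a, b):
        return float(np.dot(a, b) /
                     (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

For example, two nearly parallel vectors are merged into one, while an orthogonal vector is kept separate, so memory grows with the number of *distinct* behaviors rather than the number of tasks.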