In dynamic domains such as autonomous robotics and video game simulations, agents must continuously adapt to new tasks while retaining previously acquired skills. This ongoing process, known as Continual Reinforcement Learning, presents significant challenges, including the risk of forgetting past knowledge and the need for solutions that scale as the number of tasks grows. To address these issues, we introduce HIerarchical LOW-rank Subspaces of Policies (HILOW), a novel framework designed for continual learning in offline navigation settings. HILOW leverages hierarchical policy subspaces to enable flexible and efficient adaptation to new tasks while preserving existing knowledge. Through a careful experimental study, we demonstrate the effectiveness of our method in both classical MuJoCo maze environments and complex video game-like simulations, showing competitive performance and satisfactory adaptability on standard continual learning metrics, particularly with respect to memory usage. Our work provides a promising framework for real-world applications where continual learning from pre-collected data is essential.