FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning - Zhuanzhi Papers

Memory replay · Continual learning · Language models · Scheduling · Catastrophic forgetting

FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning

Yujie Feng, Hao Wang, Jian Li, Xu Chu, Zhaolu Kang, Yiran Liu, Yasha Wang, Philip S. Yu, Xiao-Ming Wu

Continual learning (CL) for large language models (LLMs) aims to enable sequential knowledge acquisition without catastrophic forgetting. Memory replay methods are widely used for their practicality and effectiveness, but most rely on fixed, step-based heuristics that often misalign with the model's actual learning progress, since identical training steps can result in varying degrees of parameter change. Motivated by recent findings that LLM forgetting mirrors the Ebbinghaus human forgetting curve, we propose FOREVER (FORgEtting curVe-inspired mEmory Replay), a novel CL framework that aligns replay schedules with a model-centric notion of time. FOREVER defines model time using the magnitude of optimizer updates, allowing forgetting curve-inspired replay intervals to align with the model's internal evolution rather than raw training steps. Building on this approach, FOREVER incorporates a forgetting curve-based replay scheduler to determine when to replay and an intensity-aware regularization mechanism to adaptively control how to replay. Extensive experiments on three CL benchmarks and models ranging from 0.6B to 13B parameters demonstrate that FOREVER consistently mitigates catastrophic forgetting.
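The idea of scheduling replay on model time rather than on raw training steps can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ModelTimeReplayScheduler`, the use of the L2 norm of the parameter change as the update magnitude, and the expanding interval list `(1.0, 2.0, 4.0, 8.0)` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def update_norm(old_params, new_params):
    """L2 norm of the parameter change produced by one optimizer step."""
    return math.sqrt(sum((n - o) ** 2 for o, n in zip(old_params, new_params)))


class ModelTimeReplayScheduler:
    """Signals replay when accumulated update magnitude crosses a threshold.

    The expanding intervals are a hypothetical, spaced-repetition-style
    choice; the abstract does not state the actual values.
    """

    def __init__(self, intervals=(1.0, 2.0, 4.0, 8.0)):
        self.intervals = list(intervals)
        self.model_time = 0.0          # accumulated update magnitude
        self._idx = 0                  # index of the current interval
        self._next_due = self.intervals[0]

    def step(self, old_params, new_params):
        """Advance model time by the update norm; return True if replay is due."""
        self.model_time += update_norm(old_params, new_params)
        if self.model_time < self._next_due:
            return False
        # Move to the next (longer) interval; reuse the last one once exhausted.
        self._idx = min(self._idx + 1, len(self.intervals) - 1)
        self._next_due = self.model_time + self.intervals[self._idx]
        return True


# Six identical steps, each changing a single parameter by 0.5.
scheduler = ModelTimeReplayScheduler()
fired = [scheduler.step([0.0], [0.5]) for _ in range(6)]
```

In this sketch a training step that moves the parameters a lot advances model time faster than one that barely changes them, so replay is signalled sooner under large updates even though the step count is the same.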
