Continual learning aims to incrementally acquire new concepts from data streams while resisting the forgetting of previous knowledge. With the rise of powerful pre-trained models (PTMs), there is growing interest in training incremental learning systems on top of these foundation models rather than learning from scratch. Existing works often view the PTM as a strong initialization and directly apply parameter-efficient tuning (PET) in the first session to adapt to downstream tasks. In the subsequent sessions, most methods freeze the model parameters to mitigate forgetting. However, applying PET directly to downstream data cannot fully exploit the knowledge inherent in PTMs. Moreover, freezing the parameters in incremental sessions limits the model's plasticity toward novel concepts not covered in the first session. To address these issues, we propose a Slow And Fast parameter-Efficient tuning (SAFE) framework. In particular, to inherit general knowledge from the foundation model, we introduce a transfer loss that measures the correlation between the PTM and the PET-applied model. After this calibration in the first session, the slow efficient-tuning parameters capture more informative features, improving generalization to incoming classes. Furthermore, to incorporate novel concepts, we strike a balance between stability and plasticity by fixing the slow efficient-tuning parameters and continuously updating the fast ones. Specifically, a cross-classification loss with feature alignment is proposed to circumvent catastrophic forgetting. During inference, we introduce an entropy-based aggregation strategy that dynamically exploits the complementarity of the slow and fast learners. Extensive experiments on seven benchmark datasets verify the effectiveness of our method, which significantly surpasses the state-of-the-art.
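The correlation-based transfer loss and the entropy-based aggregation are only described at a high level in the abstract. A minimal NumPy sketch of one plausible instantiation is given below; the function names, the per-dimension correlation form of the transfer loss, and the exponential-of-negative-entropy weighting are our assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def transfer_loss(f_ptm, f_pet, eps=1e-8):
    """Encourage the PET-applied model's features to stay correlated
    with the frozen PTM's features (one plausible reading of the
    paper's transfer loss; the exact form is an assumption)."""
    # Standardize each feature dimension over the batch.
    z1 = (f_ptm - f_ptm.mean(0)) / (f_ptm.std(0) + eps)
    z2 = (f_pet - f_pet.mean(0)) / (f_pet.std(0) + eps)
    n = f_ptm.shape[0]
    c = z1.T @ z2 / n                      # cross-correlation matrix
    # Push the correlation of corresponding dimensions toward 1.
    return float(((np.diag(c) - 1.0) ** 2).sum())

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=-1)

def aggregate(slow_logits, fast_logits):
    """Entropy-based aggregation at inference: the learner with the
    lower predictive entropy (higher confidence) gets a larger weight."""
    p_s, p_f = softmax(slow_logits), softmax(fast_logits)
    h_s, h_f = entropy(p_s), entropy(p_f)
    w_s = np.exp(-h_s) / (np.exp(-h_s) + np.exp(-h_f))
    w_f = 1.0 - w_s
    return w_s[..., None] * p_s + w_f[..., None] * p_f
```

For example, if the slow learner is highly confident about class 0 while the fast learner is near-uniform, the aggregated prediction is dominated by the slow learner; the situation reverses for novel classes the fast learner has adapted to.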