Continual learning aims to incrementally acquire new concepts from data streams while resisting the forgetting of previous knowledge. With the rise of powerful pre-trained models (PTMs), there is growing interest in training incremental learning systems on top of these foundation models rather than learning from scratch. Existing works often view the PTM as a strong initialization and directly apply parameter-efficient tuning (PET) in the first session to adapt to downstream tasks. In the subsequent sessions, most methods freeze the model parameters to mitigate forgetting. However, applying PET directly to downstream data cannot fully exploit the knowledge inherent in PTMs. Moreover, freezing the parameters in incremental sessions limits the model's plasticity toward novel concepts not covered in the first session. To address these issues, we propose a Slow And Fast parameter-Efficient tuning (SAFE) framework. In particular, to inherit general knowledge from the foundation model, we introduce a transfer loss that measures the correlation between the PTM and the PET-applied model. After this calibration in the first session, the slow efficient-tuning parameters capture more informative features, improving generalization to incoming classes. Furthermore, to incorporate novel concepts, we strike a balance between stability and plasticity by fixing the slow efficient-tuning parameters and continuously updating the fast ones. Specifically, a cross-classification loss with feature alignment is proposed to circumvent catastrophic forgetting. During inference, we introduce an entropy-based aggregation strategy that dynamically exploits the complementarity of the slow and fast learners. Extensive experiments on seven benchmark datasets verify the effectiveness of our method, which significantly surpasses the state-of-the-art.
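The correlation-based transfer loss and the entropy-based aggregation are only described at a high level in the abstract. A minimal NumPy sketch of one plausible instantiation is given below; the function names, the per-dimension correlation form of the transfer loss, and the exponential-of-negative-entropy weighting are our assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def transfer_loss(f_ptm, f_pet, eps=1e-8):
    """Encourage the PET-applied model's features to stay correlated
    with the frozen PTM's features (one plausible reading of the
    paper's transfer loss; the exact form is an assumption)."""
    # Standardize each feature dimension over the batch.
    z1 = (f_ptm - f_ptm.mean(0)) / (f_ptm.std(0) + eps)
    z2 = (f_pet - f_pet.mean(0)) / (f_pet.std(0) + eps)
    n = f_ptm.shape[0]
    c = z1.T @ z2 / n                      # cross-correlation matrix
    # Push the correlation of corresponding dimensions toward 1.
    return float(((np.diag(c) - 1.0) ** 2).sum())

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=-1)

def aggregate(slow_logits, fast_logits):
    """Entropy-based aggregation at inference: the learner with the
    lower predictive entropy (higher confidence) gets a larger weight."""
    p_s, p_f = softmax(slow_logits), softmax(fast_logits)
    h_s, h_f = entropy(p_s), entropy(p_f)
    w_s = np.exp(-h_s) / (np.exp(-h_s) + np.exp(-h_f))
    w_f = 1.0 - w_s
    return w_s[..., None] * p_s + w_f[..., None] * p_f
```

For example, if the slow learner is highly confident about class 0 while the fast learner is near-uniform, the aggregated prediction is dominated by the slow learner; the situation reverses for novel classes the fast learner has adapted to.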