Continual Learning (CL) with large-scale pre-trained models (PTMs) has recently gained wide attention, shifting the focus from training from scratch to continually adapting PTMs. This has given rise to a promising paradigm: parameter-efficient continual learning (PECL), where task interference is typically mitigated by assigning a task-specific module, such as a low-rank adapter, during training. However, weight regularization techniques, such as Elastic Weight Consolidation (EWC), a key strategy in CL, remain underexplored in this new paradigm. In this paper, we revisit weight regularization in low-rank CL as a new perspective for mitigating task interference in PECL. Unlike existing low-rank CL methods, we mitigate task interference by regularizing a shared low-rank update through EWC, thereby keeping storage requirements and inference costs constant regardless of the number of tasks. Our proposed method, EWC-LoRA, leverages a low-rank representation to estimate parameter importance over the full-dimensional space. This design offers a practical, compute- and memory-efficient solution for CL with PTMs, and provides insights that may inform the broader application of regularization techniques within PECL. Extensive experiments on various benchmarks demonstrate the effectiveness of EWC-LoRA, which achieves a stability-plasticity trade-off superior to that of existing low-rank CL approaches. These results indicate that, even under low-rank parameterizations, weight regularization remains an effective mechanism for mitigating task interference. Code is available at: https://github.com/yaoyz96/low-rank-cl.
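To make the core idea concrete, the sketch below shows a generic EWC-style quadratic penalty applied to shared low-rank (LoRA-style) factors: a diagonal Fisher estimate weights how far each entry of the factors has drifted from its value after the previous task. All function and variable names here are illustrative, and this toy penalizes the factors directly; the paper's EWC-LoRA instead uses the low-rank representation to estimate importance in the full-dimensional space, which this minimal version does not reproduce.

```python
import numpy as np

def ewc_penalty(A, B, A_star, B_star, F_A, F_B, lam=100.0):
    """Generic EWC-style quadratic penalty on shared low-rank factors.

    The LoRA-style update is W = B @ A. F_A, F_B are diagonal Fisher
    estimates (e.g. squared gradients averaged over the previous task's
    data); A_star, B_star are the factors anchored after that task.
    Illustrative names only, not the paper's notation.
    """
    return 0.5 * lam * (np.sum(F_A * (A - A_star) ** 2)
                        + np.sum(F_B * (B - B_star) ** 2))

# Toy usage: a rank-2 update for a 4x4 weight matrix, with a stand-in
# Fisher built from fake squared gradients.
rng = np.random.default_rng(0)
A_star = rng.standard_normal((2, 4))   # anchor for factor A (r x d_in)
B_star = rng.standard_normal((4, 2))   # anchor for factor B (d_out x r)
F_A = rng.standard_normal((2, 4)) ** 2  # stand-in for E[grad^2] on A
F_B = rng.standard_normal((4, 2)) ** 2  # stand-in for E[grad^2] on B

pen = ewc_penalty(A_star, B_star, A_star, B_star, F_A, F_B)
print(pen)  # 0.0: no penalty when the factors sit exactly at the anchor
```

In training, this scalar would be added to the new task's loss, pulling the shared low-rank update back toward parameter values the Fisher marks as important for earlier tasks; because the same factors are reused across tasks, storage does not grow with the task count.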