Continual learning (CL) aims to continually accumulate knowledge from a non-stationary data stream without catastrophically forgetting what has already been learned, which requires balancing stability and plasticity. Leveraging the generalizable representations of pre-trained models (PTMs), PTM-based CL methods adapt effectively to downstream tasks by adding learnable adapters or prompts on top of the frozen PTMs. However, many existing PTM-based CL methods restrict adaptation to a fixed set of these modules to avoid forgetting, which limits their CL ability, while periodically adding task-specific modules leads to a linear model growth rate and impaired knowledge reuse. We propose Self-Expansion of pre-trained models with Modularized Adaptation (SEMA), a novel approach that improves control of the stability-plasticity balance in PTM-based CL. SEMA automatically decides whether to reuse or add adapter modules on demand during CL, depending on whether a significant distribution shift that cannot be handled is detected at different representation levels. We design a modular adapter consisting of a functional adapter and a representation descriptor; the representation descriptor is trained as a distribution-shift indicator and used to trigger self-expansion signals. To better compose the adapters, an expandable weighting router is learned jointly to mix the adapter outputs. SEMA enables better knowledge reuse and a sub-linear expansion rate. Extensive experiments demonstrate the effectiveness of the proposed self-expansion method, which achieves state-of-the-art performance compared with PTM-based CL methods without memory rehearsal. Code is available at https://github.com/huiyiwang01/SEMA-CL.
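The reuse-or-expand decision described above can be illustrated with a minimal sketch. This is not the paper's implementation: the mean-distance descriptor, the fixed novelty threshold, and the softmax-over-negative-error router are simplifying assumptions standing in for the trained representation descriptors and the learned expandable router.

```python
import math


class ModularAdapter:
    """Hypothetical modular adapter: a functional adapter paired with a
    representation descriptor modeling the features it was added for.
    Here the descriptor is simply the stored feature mean."""

    def __init__(self, mean):
        self.mean = list(mean)  # descriptor state captured at expansion time

    def descriptor_error(self, x):
        # Novelty score: Euclidean distance of x from the stored mean.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, self.mean)))

    def forward(self, x):
        # Placeholder functional adapter: shift input by the stored mean.
        return [a + b for a, b in zip(x, self.mean)]


class SelfExpandingLayer:
    """Sketch of a reuse-or-expand decision at one representation level."""

    def __init__(self, threshold):
        self.adapters = []
        self.threshold = threshold

    def maybe_expand(self, x):
        # Expand only if no existing descriptor explains x well enough,
        # i.e. every adapter reports a large distribution shift.
        if all(a.descriptor_error(x) > self.threshold for a in self.adapters):
            self.adapters.append(ModularAdapter(x))
            return True  # self-expansion signal triggered
        return False  # reuse existing adapters

    def forward(self, x):
        # Router stand-in: softmax over negative descriptor errors
        # produces mixture weights for the adapter outputs.
        errs = [a.descriptor_error(x) for a in self.adapters]
        ws = [math.exp(-e) for e in errs]
        z = sum(ws)
        ws = [w / z for w in ws]
        out = [0.0] * len(x)
        for w, a in zip(ws, self.adapters):
            for i, v in enumerate(a.forward(x)):
                out[i] += w * v
        return out
```

A small usage example: the first batch always triggers expansion, nearby features reuse the existing adapter, and a large shift adds a second module, giving growth only when new distributions appear.

```python
layer = SelfExpandingLayer(threshold=1.0)
layer.maybe_expand([0.0, 0.0])   # True: first distribution, expand
layer.maybe_expand([0.1, 0.1])   # False: small shift, reuse
layer.maybe_expand([5.0, 5.0])   # True: large shift, expand again
```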