Continual learning for pre-trained vision-language models requires balancing three competing objectives: retaining pre-trained knowledge, preserving knowledge from a sequence of learned tasks, and maintaining the plasticity to acquire new knowledge. This paper presents KeepLoRA, a simple but effective approach to balancing these objectives. We first analyze the knowledge retention mechanism within the model parameter space and find that general knowledge is mainly encoded in the principal subspace, while task-specific knowledge is encoded in the residual subspace. Motivated by this finding, KeepLoRA learns new tasks by restricting LoRA parameter updates to the residual subspace, which prevents interference with previously learned capabilities. Specifically, we infuse knowledge for a new task by projecting its gradient onto a subspace orthogonal to both the principal subspace of the pre-trained model and the dominant directions of previous task features. Our theoretical and empirical analyses confirm that KeepLoRA balances the three objectives and achieves state-of-the-art performance. The implementation code is available at https://github.com/MaolinLuo/KeepLoRA.
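The projection step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); it only shows the general idea of removing protected directions from a gradient. The function names `orthonormal_basis` and `project_to_residual`, the rank parameters `k_principal` and `k_features`, and the simplifying assumption that the weight's left singular vectors and the previous-task feature vectors live in the same d-dimensional space are all hypothetical choices made for illustration.

```python
import numpy as np


def orthonormal_basis(directions, tol=1e-10):
    """Return orthonormal columns spanning the column space of `directions`."""
    u, s, _ = np.linalg.svd(directions, full_matrices=False)
    if s.size == 0 or s.max() == 0:
        return u[:, :0]  # empty basis: nothing to protect
    return u[:, s > tol * s.max()]


def project_to_residual(grad, w_pretrained, prev_features, k_principal, k_features):
    """Remove from `grad` its components along the protected directions.

    Assumed shapes (illustrative only):
      grad, w_pretrained: (d, m); prev_features: (d, n), one feature per column.
    """
    # Principal subspace: top-k left singular vectors of the pre-trained weight.
    u_w, _, _ = np.linalg.svd(w_pretrained, full_matrices=False)
    principal = u_w[:, :k_principal]

    # Dominant directions of features collected from previous tasks.
    u_f, _, _ = np.linalg.svd(prev_features, full_matrices=False)
    dominant = u_f[:, :k_features]

    # Orthonormalize the union, since the two sets may overlap.
    basis = orthonormal_basis(np.concatenate([principal, dominant], axis=1))

    # Orthogonal projection onto the complement of span(basis).
    return grad - basis @ (basis.T @ grad)
```

Under these assumptions, the returned gradient has no component along either protected set, so an update built from it leaves those directions unchanged to first order. How the ranks are chosen, and on which side of the weight matrix the projection is applied, are design decisions this sketch does not try to reproduce.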