Foundation models encompass an extensive knowledge base and offer remarkable transferability. However, this knowledge becomes outdated or insufficient over time. The challenge lies in continually updating foundation models to accommodate new information while retaining their original capabilities. Leveraging the fact that foundation models already hold initial knowledge of various tasks and domains, we propose a novel approach that, instead of updating all parameters equally, localizes the updates to a sparse set of parameters relevant to the task being learned. This strikes a balance between efficiency and new-task performance while maintaining the transferability and generalizability of foundation models. We extensively evaluate our method on vision-language foundation models across a diverse spectrum of continual learning tasks. Our method improves accuracy on newly learned tasks by up to 7% while preserving pretraining knowledge, with a negligible decrease of 0.9% in accuracy on a representative control set.
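To make the idea of localized updates concrete, the following is a minimal sketch of one way a sparse update could look. It is not the method evaluated above: the abstract does not state how task-relevant parameters are identified, so the selection rule here (keep the parameters with the largest gradient magnitude) is an assumption, and the names `top_k_mask`, `sparse_sgd_step`, and `sparsity` are hypothetical.

```python
from typing import List


def top_k_mask(grads: List[float], sparsity: float) -> List[bool]:
    """Mark the fraction `sparsity` of parameters with the largest |gradient|.

    Assumption: gradient magnitude stands in for "relevance to the task";
    the source does not specify the actual selection criterion.
    """
    k = max(1, int(round(sparsity * len(grads))))
    # Rank parameter indices by descending gradient magnitude, keep the first k.
    ranked = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
    selected = set(ranked[:k])
    return [i in selected for i in range(len(grads))]


def sparse_sgd_step(
    params: List[float], grads: List[float], mask: List[bool], lr: float
) -> List[float]:
    """Apply a plain SGD step only where mask is True; the rest stay frozen."""
    return [p - lr * g if m else p for p, g, m in zip(params, grads, mask)]


# Toy example: five "pretrained" parameters and their gradients on a new task.
params = [0.5, -1.0, 2.0, 0.1, -0.3]
grads = [0.01, -0.80, 0.02, 0.60, -0.03]

mask = top_k_mask(grads, sparsity=0.4)  # update 40% of the parameters
updated = sparse_sgd_step(params, grads, mask, lr=0.1)
```

In this toy setting only the two parameters with the largest gradients move, and the remaining three keep their pretrained values exactly, which is the intuition behind limiting forgetting by restricting where updates land.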