Foundation models encode extensive knowledge and transfer remarkably well to new tasks. However, this knowledge becomes outdated or insufficient over time. The challenge lies in continually updating foundation models to accommodate new information while retaining their original capabilities. Leveraging the fact that foundation models already have initial knowledge of various tasks and domains, we propose a novel approach that, instead of updating all parameters equally, localizes the updates to a sparse set of parameters relevant to the task being learned. Our approach balances efficiency against new-task performance while maintaining the transferability and generalizability of foundation models. We extensively evaluate our method on vision-language foundation models across a diverse spectrum of continual learning tasks. Our method improves accuracy on newly learned tasks by up to 7% while preserving pretraining knowledge, with a negligible 0.9% decrease in accuracy on a representative control set.
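To make the idea of localized updates concrete, the following is a minimal sketch of a sparse gradient step. The abstract does not say how the relevant parameters are chosen, so selecting the entries with the largest gradient magnitude is purely an assumption made for illustration. The function name `sparse_update` and the parameters `lr` and `keep_fraction` are hypothetical and are not taken from the paper.

```python
import numpy as np


def sparse_update(weights, grads, lr=0.1, keep_fraction=0.1):
    """Apply one SGD step to only a sparse subset of entries.

    ASSUMPTION: "relevant" parameters are those with the largest
    gradient magnitude on the new task. The source abstract does not
    specify the selection rule; this is only an illustrative choice.
    """
    flat = np.abs(grads).ravel()
    # Number of entries allowed to change (at least one).
    k = max(1, int(round(keep_fraction * flat.size)))
    # Indices of the k largest gradient magnitudes (unordered).
    top_idx = np.argpartition(flat, flat.size - k)[flat.size - k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[top_idx] = True
    mask = mask.reshape(grads.shape)
    # Entries outside the mask keep their pretrained values exactly.
    new_weights = weights - lr * grads * mask
    return new_weights, mask


# Toy stand-ins for one layer's pretrained weights and new-task gradients.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 5))
grads = rng.normal(size=(4, 5))

new_weights, mask = sparse_update(weights, grads, lr=0.1, keep_fraction=0.2)
print("updated entries:", int(mask.sum()), "of", mask.size)
```

With `keep_fraction=0.2`, only 4 of the 20 entries change and the remaining 16 stay at their pretrained values, which is the intuition behind preserving prior knowledge while adapting to a new task.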