Class-incremental learning is a challenging problem in which the goal is to train a model that can classify data from an increasing number of classes over time. Vision-language pre-trained models such as CLIP demonstrate strong generalization ability, allowing them to excel at class-incremental learning even with completely frozen parameters. However, further adapting the model to downstream tasks by simple fine-tuning leads to severe forgetting. Most existing works with pre-trained models assume that the forgetting of old classes is uniform as the model acquires new knowledge. In this paper, we propose a method named Adaptive Representation Adjustment and Parameter Fusion (RAPF). During training on new data, we measure the influence of new classes on old ones and adjust their representations using textual features. After training, we employ a decomposed parameter fusion to further mitigate forgetting during adapter module fine-tuning. Experiments on several conventional benchmarks show that our method achieves state-of-the-art results. Our code is available at \url{https://github.com/linlany/RAPF}.