Foundation models, including Vision Language Models (VLMs) and Large Language Models (LLMs), possess the generality to handle diverse distributions and tasks, which stems from their extensive pre-training datasets. Fine-tuning foundation models is a common practice for improving task performance or aligning model behavior with human expectations, allowing the models to gain speciality. However, the small datasets used for fine-tuning may not adequately cover the diverse distributions and tasks encountered during pre-training. Consequently, pursuing speciality during fine-tuning can cost the model some of its generality, a phenomenon related to catastrophic forgetting (CF) in deep learning. In this study, we demonstrate this phenomenon in both VLMs and LLMs. For instance, fine-tuning VLMs such as CLIP on ImageNet reduces their generality in handling diverse distributions, and fine-tuning LLMs such as Galactica in the medical domain degrades their instruction following and common sense. To address the trade-off between speciality and generality, we investigate multiple regularization methods from continual learning; the weight averaging method Wise-FT from out-of-distribution (OOD) generalization, which interpolates parameters between the pre-trained and fine-tuned models; and parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA). Our findings show that both continual learning and Wise-FT methods effectively mitigate the loss of generality, with Wise-FT exhibiting the strongest performance in balancing speciality and generality.
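The abstract describes Wise-FT as interpolating parameters between the pre-trained and fine-tuned models. The following is a minimal sketch of that idea, assuming a plain linear interpolation with a single mixing coefficient. The function name `interpolate_weights`, the coefficient name `alpha`, and the toy parameter dictionaries are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def interpolate_weights(pretrained: Params, finetuned: Params, alpha: float) -> Params:
    """Return (1 - alpha) * pretrained + alpha * finetuned for every parameter.

    alpha = 0.0 recovers the pre-trained model (most general);
    alpha = 1.0 recovers the fine-tuned model (most specialized).
    The name `alpha` is an assumption made for this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if pretrained.keys() != finetuned.keys():
        raise ValueError("both models must have the same parameter names")

    merged: Params = {}
    for name, theta_pre in pretrained.items():
        theta_ft = finetuned[name]
        if len(theta_pre) != len(theta_ft):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise linear interpolation between the two checkpoints.
        merged[name] = [
            (1.0 - alpha) * p + alpha * f for p, f in zip(theta_pre, theta_ft)
        ]
    return merged


# Hypothetical two-parameter "models" used only to exercise the function.
pretrained = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
finetuned = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}

merged = interpolate_weights(pretrained, finetuned, alpha=0.5)
```

In a sketch like this, sweeping `alpha` between 0 and 1 would trace out a family of models between the two endpoints, which is one way to examine the balance between speciality and generality that the abstract discusses.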