Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer

Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining. MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is forgotten) and negative forward transfer (where performance on future tasks is degraded). Although existing methods have greatly alleviated catastrophic forgetting, they still suffer from negative forward transfer. By performing singular value decomposition (SVD) on input embeddings, we discover a large discrepancy among different input embeddings. This discrepancy causes the model to learn information that is irrelevant to old and pre-trained tasks, which leads to catastrophic forgetting and negative forward transfer. To address these issues, we propose Fwd-Prompt, a prompt-based method that projects the prompt gradient onto the residual space to minimize interference between tasks and onto the pre-trained subspace to reuse pre-trained knowledge. Our experiments demonstrate that Fwd-Prompt achieves state-of-the-art performance while updating fewer parameters and requiring no old samples. Our research sheds light on the potential of continuously adapting MLLMs to new tasks under the instruction tuning paradigm and encourages future studies to explore MCIT. The code will soon be publicly available.
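The gradient projection idea mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' Fwd-Prompt implementation: the function names `split_subspaces` and `project_onto`, the fixed rank `k`, and the synthetic data are all assumptions made for illustration. The sketch takes the SVD of a matrix of old-task input embeddings, treats the top-`k` right singular vectors as the subspace those inputs occupy, and splits a prompt gradient into the part inside that subspace and the part in the residual (orthogonal) space.

```python
import numpy as np


def split_subspaces(embeddings, k):
    """Split feature space into a top-k subspace and its residual space.

    embeddings: (n_samples, dim) matrix of input embeddings.
    Returns two orthonormal bases with shapes (dim, k) and (dim, dim - k).
    Choosing a fixed rank k is an illustrative assumption.
    """
    # full_matrices=True yields a complete orthonormal basis of R^dim in vt.
    _, _, vt = np.linalg.svd(embeddings, full_matrices=True)
    return vt[:k].T, vt[k:].T


def project_onto(grad, basis):
    """Orthogonally project each row of grad onto the span of basis."""
    return grad @ basis @ basis.T


# Synthetic "old task" embeddings that lie exactly in a 3-dim subspace of R^8.
rng = np.random.default_rng(0)
directions = rng.normal(size=(3, 8))
old_embeddings = rng.normal(size=(50, 3)) @ directions

core, residual = split_subspaces(old_embeddings, k=3)

# Hypothetical gradient for a prompt of 4 tokens with embedding size 8.
grad = rng.normal(size=(4, 8))
grad_residual = project_onto(grad, residual)
grad_core = project_onto(grad, core)

# An update along grad_residual has zero inner product with every old
# embedding, so it does not change linear responses to those inputs.
interference = np.abs(old_embeddings @ grad_residual.T).max()
print(f"max interference with old inputs: {interference:.2e}")
```

In this toy setting the residual-space component has numerically zero inner product with every old embedding, which is the sense in which such a projection limits interference between tasks. Projecting onto a subspace estimated from pre-trained inputs would use the same `project_onto` call with a different basis.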

