Large language models (LLMs) suffer from forgetting of upstream data when fine-tuned. Despite efforts to mitigate forgetting, few studies have investigated whether, and how, forgotten upstream examples depend on and are associated with newly learned tasks. Insights into such associations enable efficient and targeted mitigation of forgetting. In this paper, we empirically analyze the forgetting (measured as log-perplexity increase) that occurs in $N$ upstream examples of language modeling or instruction-tuning after fine-tuning LLMs on one of $M$ new tasks, visualized in $M\times N$ matrices. We demonstrate that the matrices display simple low-rank patterns, often well-approximated by multiplicative scalar effects of upstream examples and newly learned tasks. We also examine fine-grained associations with visualization and statistics. Leveraging the low-rank nature of the associations, we predict the forgetting of upstream examples when fine-tuning on unseen tasks via matrix completion over the empirical associations. This enables fast identification of the most forgotten examples without expensive inference over the entire upstream data. Despite its simplicity, the approach outperforms prior approaches that use LMs to learn semantic relationships between learned tasks and upstream examples for predicting forgetting. We demonstrate the practical utility of our analysis by showing statistically significant reductions in forgetting when we upweight predicted examples for replay during fine-tuning. Project page: https://inklab.usc.edu/lm-forgetting-prediction/
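The low-rank idea above can be illustrated with a minimal sketch, assuming (as a simplification, not the paper's exact procedure) that the forgetting matrix is rank-1, i.e. the log-perplexity increase of upstream example $j$ after fine-tuning on task $i$ factorizes as a task effect times an example effect. For an unseen task, measuring forgetting on a handful of probe examples then suffices to predict forgetting on all remaining examples. All sizes, names, and the synthetic data below are hypothetical:

```python
import numpy as np

# Hypothetical rank-1 model: forgetting F[i, j] = a[i] * b[j],
# where a[i] is a per-task scale and b[j] a per-example forgettability.
rng = np.random.default_rng(0)
M, N = 8, 50                       # 8 seen tasks, 50 upstream examples (illustrative)
a = rng.uniform(0.5, 2.0, M)       # per-task forgetting scales
b = rng.uniform(0.1, 1.0, N)       # per-example forgettability
F = np.outer(a, b)                 # "observed" M x N log-perplexity increases

# An unseen task: measure forgetting on only a few probe examples,
# then complete the remaining entries using the low-rank structure.
new_row = 1.3 * b                  # ground-truth forgetting for the unseen task
probe = np.arange(5)               # indices of the few measured examples

# Rank-1 basis from the seen tasks via SVD: the leading right singular
# vector v is proportional to the example-effect vector b.
_, _, vt = np.linalg.svd(F, full_matrices=False)
v = vt[0]

# Least-squares fit of the unseen task's scale on the probed entries only.
scale = (new_row[probe] @ v[probe]) / (v[probe] @ v[probe])
predicted = scale * v              # completed row: predicted forgetting for all N examples

# Predictions should rank examples the same way as the ground truth.
corr = np.corrcoef(predicted, new_row)[0, 1]
```

Under the exact rank-1 assumption the completed row matches the ground truth up to floating-point error, so the most-forgotten upstream examples can be identified from a few probes; real forgetting matrices are only approximately low-rank, so in practice more singular components and more probes would be needed.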