Low-Rank Adaptation (LoRA) is the dominant parameter-efficient fine-tuning method due to its favorable compute-performance trade-off, yet it suffers from catastrophic forgetting. We study forgetting through a tractable _mean-field self-attention_ toy model, in which tokens evolve as an interacting particle system and LoRA acts as a low-rank perturbation. Using tools from partial differential equations and dynamical systems, we characterize regimes that suggest phase transitions between forgetting and non-forgetting behavior. We show that one transition appears with respect to the norm of the perturbation and the other with respect to the depth of the Transformer. We further bound the time-to-deviation in terms of the perturbation size and spectral quantities, and corroborate the predicted trends with experiments and exploratory analyses on real models under LoRA fine-tuning.
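To make the setup concrete, the following is a minimal numerical sketch of the kind of toy model the abstract describes, not the authors' actual construction. It assumes a common form of self-attention dynamics on the unit sphere (softmax-weighted drift projected onto the tangent space, integrated with explicit Euler steps), identity query/key/value matrices for the base model, and a LoRA-style update `eps * B @ A` applied to the value matrix only. The names `attention_step` and `deviation_curve`, and all parameter defaults, are hypothetical choices made for illustration.

```python
import numpy as np


def attention_step(X, Q, K, V, beta, dt):
    """One explicit Euler step of softmax self-attention dynamics on the unit sphere.

    X has shape (n, d): n tokens (particles), each a unit vector in R^d.
    """
    # logits[i, j] = beta * <Q x_i, K x_j>
    logits = beta * (X @ Q.T) @ (X @ K.T).T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    W = np.exp(logits)
    W /= W.sum(axis=1, keepdims=True)            # row-wise softmax
    drift = W @ (X @ V.T)                        # sum_j W[i, j] * V x_j
    # Project the drift onto the tangent space of the sphere at each x_i.
    drift -= np.sum(drift * X, axis=1, keepdims=True) * X
    X_new = X + dt * drift
    # Renormalize so tokens stay on the unit sphere.
    return X_new / np.linalg.norm(X_new, axis=1, keepdims=True)


def deviation_curve(eps, rank=1, n=16, d=8, steps=200, dt=0.05, beta=1.0, seed=0):
    """Max token-wise distance between base and perturbed trajectories per step.

    Assumption: the perturbation is eps * (B @ A) with B @ A scaled to unit
    spectral norm, added to the value matrix only.
    """
    rng = np.random.default_rng(seed)
    X0 = rng.normal(size=(n, d))
    X0 /= np.linalg.norm(X0, axis=1, keepdims=True)

    Q = K = V = np.eye(d)                        # assumed base weights
    A = rng.normal(size=(rank, d))
    B = rng.normal(size=(d, rank))
    delta = B @ A
    delta /= np.linalg.norm(delta, 2)            # unit spectral norm
    V_pert = V + eps * delta                     # low-rank perturbed value matrix

    X_base, X_pert = X0.copy(), X0.copy()
    devs = []
    for _ in range(steps):                       # step index plays the role of depth
        X_base = attention_step(X_base, Q, K, V, beta, dt)
        X_pert = attention_step(X_pert, Q, K, V_pert, beta, dt)
        devs.append(np.max(np.linalg.norm(X_base - X_pert, axis=1)))
    return np.array(devs)


if __name__ == "__main__":
    for eps in (0.0, 0.01, 0.1, 1.0):
        curve = deviation_curve(eps)
        print(f"eps={eps:<5} final deviation={curve[-1]:.4f}")
```

Sweeping `eps` and reading the curve against the step index gives a rough way to explore how deviation from the base trajectory depends on perturbation norm and depth in this toy. Any thresholds observed here are properties of the sketch and its assumed parameters, not of the model analyzed in the paper.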