Recent work has shown that pretrained transformers achieve impressive performance when fine-tuned on data from the downstream problem of interest. However, they struggle to retain that performance when the data characteristics change. In this paper, we focus on continual learning, where a pretrained transformer is updated to perform well on new data while retaining its performance on data it was previously trained on. Earlier works have tackled this primarily through methods inspired by prompt tuning. We question this choice and investigate the applicability of Low Rank Adaptation (LoRA) to continual learning. On a range of domain-incremental learning benchmarks, our LoRA-based solution, CoLoR, yields state-of-the-art performance while remaining as parameter-efficient as the prompt-tuning-based methods.