Continual Learning of Language Models

Language models (LMs) have been instrumental to the rapid advance of natural language processing. This paper studies continual learning of LMs, in particular continual domain-adaptive pre-training (continual DAP-training). Existing research has shown that further pre-training an LM on a domain corpus, so that the LM adapts to that domain, can improve end-task performance in the domain. This paper proposes a novel method that continually DAP-trains an LM on a sequence of unlabeled domain corpora, adapting the LM to each domain in turn to improve end-task performance in those domains. The key novelty of the method is a soft-masking mechanism that directly controls the update to the LM. A novel proxy is also proposed to preserve the general knowledge in the original LM. In addition, the method contrasts the representations of the previously learned domain knowledge (including the general knowledge in the pre-trained LM) with those of the knowledge from the current full network to achieve knowledge integration. The method not only overcomes catastrophic forgetting but also achieves knowledge transfer, which improves end-task performance. Empirical evaluation demonstrates the effectiveness of the proposed method.
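The abstract does not spell out how the soft mask controls the update, so the following is only an illustrative sketch of one plausible reading: each parameter's gradient is scaled down in proportion to an importance score in [0, 1], so that parameters judged important to earlier domains change little. The function names, the (1 - importance) scaling rule, the plain SGD step, and the example numbers are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def soft_mask_gradients(grads: List[float], importance: List[float]) -> List[float]:
    """Scale each gradient by (1 - importance).

    Assumption: importance scores lie in [0, 1], where 1 means
    "fully protected" and 0 means "free to update".
    """
    if len(grads) != len(importance):
        raise ValueError("grads and importance must have the same length")
    masked = []
    for g, imp in zip(grads, importance):
        if not 0.0 <= imp <= 1.0:
            raise ValueError("importance scores must lie in [0, 1]")
        masked.append((1.0 - imp) * g)
    return masked


def sgd_step(params: List[float], grads: List[float],
             importance: List[float], lr: float = 0.1) -> List[float]:
    """One plain SGD step using the soft-masked gradients (hypothetical helper)."""
    masked = soft_mask_gradients(grads, importance)
    return [p - lr * g for p, g in zip(params, masked)]


if __name__ == "__main__":
    params = [1.0, 1.0, 1.0]
    grads = [1.0, 1.0, 1.0]
    # Hypothetical importance scores: unprotected, half protected, fully protected.
    importance = [0.0, 0.5, 1.0]
    print(sgd_step(params, grads, importance))
```

Unlike a hard (binary) mask, this scaling lets every parameter keep learning to some degree, which is one way a method could limit forgetting while still allowing knowledge transfer across domains.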
