Due to the rapid generation and dissemination of information, large language models (LLMs) quickly become outdated despite their enormous development costs. To meet the crucial need to keep models updated, online learning has emerged as a critical tool for deploying LLMs in real-world applications. However, given the ever-expanding corpus of unseen documents and the large parameter space of modern LLMs, efficient adaptation is essential. To address these challenges, we propose Memory of Amortized Contexts (MAC), an efficient and effective online adaptation framework for LLMs with strong knowledge retention. Specifically, we use a feature-extraction and memory-augmentation approach that compresses the information in new documents into compact modulations stored in a memory bank. When answering questions, our model attends to this memory bank and extracts the relevant knowledge from it. To learn informative modulations efficiently, we employ amortization-based meta-learning, which replaces an otherwise required optimization process with a single forward pass of the encoder. Subsequently, we learn to select and aggregate documents into a single modulation by conditioning on the question, allowing us to adapt a frozen language model at test time without further gradient updates. Our experiments demonstrate the superiority of MAC in multiple aspects, including online adaptation performance as well as time and memory efficiency. In addition, we show how MAC can be combined with, and improve the performance of, popular alternatives such as retrieval-augmented generation (RAG). Code is available at: https://github.com/jihoontack/MAC.
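The mechanism sketched in the abstract can be illustrated schematically. The sketch below is not the paper's implementation: the encoder is stood in for by mean pooling, the embeddings are random, and all names (`amortize`, `aggregate`, `DIM`) are hypothetical. It only illustrates the two ideas the abstract names: a single forward pass compresses each document into a compact modulation stored in a memory bank, and a question-conditioned attention step aggregates the bank into one modulation for the frozen LM.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # modulation dimension (illustrative)

def amortize(doc_tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the amortization encoder: a single forward pass
    (here, mean pooling over token embeddings) compresses a document
    into one compact modulation vector -- no per-document optimization."""
    return doc_tokens.mean(axis=0)

def aggregate(question: np.ndarray, memory: np.ndarray) -> np.ndarray:
    """Attend over the memory bank, conditioned on the question, and
    aggregate the stored modulations into a single one."""
    scores = memory @ question / np.sqrt(DIM)       # question-document affinity
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax attention weights
    return weights @ memory                         # convex combination of modulations

# Build a memory bank from three "documents" (random token embeddings here),
# each compressed by one encoder forward pass.
memory = np.stack([amortize(rng.normal(size=(5, DIM))) for _ in range(3)])

# At test time: aggregate the bank into one modulation for the question.
question = rng.normal(size=DIM)
modulation = aggregate(question, memory)  # would condition the frozen LM
```

In the actual method, the resulting modulation conditions a frozen LLM (e.g., as a soft prompt or adapter parameters), which is what lets adaptation happen without any test-time gradient updates.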