Although current large language models (LLMs) show some capability on knowledge-intensive tasks, they are limited by their reliance on parameters as an implicit storage mechanism. As a result, they struggle with infrequent knowledge and temporal degradation. In addition, the uninterpretable nature of parametric memorization makes it challenging to understand and prevent hallucination. Parametric memory pools and model editing are only partial solutions. Retrieval-Augmented Generation (RAG), though non-parametric, has its own limitations: it lacks structure, complicates interpretability, and makes it hard to manage stored knowledge effectively. In this paper, we introduce MemLLM, a novel method of enhancing LLMs by integrating a structured and explicit read-and-write memory module. MemLLM tackles these challenges by enabling dynamic interaction with the memory and improving the LLM's ability to use stored knowledge. Our experiments indicate that MemLLM enhances the LLM's performance and interpretability, in language modeling in general and on knowledge-intensive tasks in particular. We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.