Large language models (LLMs) have significantly advanced the field of natural language processing (NLP) through their extensive parameters and comprehensive data utilization. However, existing LLMs lack a dedicated memory unit, limiting their ability to explicitly store and retrieve knowledge for various tasks. In this paper, we propose RET-LLM, a novel framework that equips LLMs with a general write-read memory unit, allowing them to extract, store, and recall knowledge from text as needed for task performance. Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets. The memory unit is designed to be scalable, aggregatable, updatable, and interpretable. Through qualitative evaluations, we demonstrate that our proposed framework outperforms baseline approaches on question answering tasks. Moreover, our framework exhibits robust performance on temporal question answering tasks, showcasing its ability to effectively manage time-dependent information.
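To make the triplet-based write-read memory concrete, the following is a minimal sketch, not the paper's implementation: the class name `TripletMemory`, the dictionary-based index, and the `write`/`read` signatures are all illustrative assumptions, and the scalability, aggregation, and update mechanisms described in the abstract are deliberately omitted.

```python
from collections import defaultdict


class TripletMemory:
    """Minimal write-read memory holding (first_arg, relation, second_arg) triplets.

    Illustrative toy version only; it does not capture the scalable,
    aggregatable, and updatable design of the actual memory unit.
    """

    def __init__(self):
        # Index each triplet under both of its arguments for direct lookup.
        self._index = defaultdict(list)

    def write(self, first, relation, second):
        """Store one knowledge triplet extracted from text."""
        triplet = (first, relation, second)
        self._index[first].append(triplet)
        self._index[second].append(triplet)

    def read(self, query):
        """Return every stored triplet mentioning the queried argument."""
        return list(self._index.get(query, []))


memory = TripletMemory()
memory.write("Alice", "works_for", "ACME Corp")
memory.write("ACME Corp", "located_in", "Berlin")
print(memory.read("ACME Corp"))
# [('Alice', 'works_for', 'ACME Corp'), ('ACME Corp', 'located_in', 'Berlin')]
```

Indexing each triplet under both of its arguments lets a question about either entity recover the stored fact, which illustrates the kind of explicit retrieval the abstract contrasts with knowledge held only in model parameters.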