Large language models (LLMs) need continual knowledge updates to keep pace with ever-growing world facts and to correct hallucinated responses, motivating methods for lifelong model editing. Where the updated knowledge resides in memories is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowledge in neural network activations/representations, accessed by retrieval) results in an impossible triangle: reliability, generalization, and locality cannot all be achieved in lifelong editing settings. For long-term memory, directly editing the parameters causes conflicts with irrelevant pretrained knowledge or previous edits (poor reliability and locality). For working memory, retrieval-based activations can hardly make the model understand the edits and generalize from them (poor generalization). We therefore propose WISE to bridge the gap between these memories. WISE employs a dual parametric memory scheme: a main memory for pretrained knowledge and a side memory for edited knowledge. We edit knowledge only in the side memory and train a router that decides which memory a given query should go through. For continual editing, we devise a knowledge-sharding mechanism in which different sets of edits reside in distinct subspaces of parameters and are subsequently merged into a shared memory without conflicts. Extensive experiments show that WISE outperforms previous model editing methods and overcomes the impossible triangle in lifelong model editing across question answering, hallucination, and out-of-distribution settings on trending LLM architectures, e.g., GPT, LLaMA, and Mistral. Code is available at https://github.com/zjunlp/EasyEdit.
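The dual-memory scheme and knowledge sharding described above can be illustrated with a minimal sketch. This is not WISE's actual implementation: the weight shapes, the random binary masks standing in for parameter subspaces, the L2-distance routing rule, and the threshold `tau` are all simplifying assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size (assumption, not from the paper)

# Main memory: a frozen pretrained weight.
W_main = rng.standard_normal((d, d))
# Side memory: a copy of the main weight that receives all edits.
W_side = W_main.copy()

def apply_edit(delta, mask):
    """Write an edit into a subspace of the side memory, selected by a
    binary mask (a stand-in for knowledge sharding)."""
    W_side[mask] += delta[mask]

# Two edit sets placed in disjoint subspaces, so merging them into the
# single shared side memory produces no conflicts.
mask1 = rng.random((d, d)) < 0.5
mask2 = ~mask1
apply_edit(0.1 * rng.standard_normal((d, d)), mask1)
apply_edit(0.1 * rng.standard_normal((d, d)), mask2)

def route(h, tau=1.0):
    """Toy router: if the side memory's activation deviates strongly
    from the main memory's on this query, assume the query hits edited
    knowledge and answer from the side memory; otherwise fall back to
    the main (pretrained) memory."""
    a_main, a_side = h @ W_main, h @ W_side
    return a_side if np.linalg.norm(a_side - a_main) > tau else a_main
```

In this sketch the routing signal is just the activation gap between the two memories; the paper instead trains the router, but the same idea holds: pretrained knowledge stays untouched in the main memory while edits accumulate, shard by shard, in the side memory.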