Despite their exceptional capabilities, large language models (LLMs) are prone to generating unintended text because of false or outdated knowledge. Given that retraining LLMs is resource-intensive, knowledge editing has attracted growing attention. However, current approaches and evaluations rarely examine how an edit perturbs neighboring knowledge. This paper studies whether injecting new knowledge into LLMs perturbs the neighboring knowledge encapsulated within them. Specifically, we investigate whether appending a new answer to the answer list of a factual question causes catastrophic forgetting of the original correct answers in that list, as well as the unintentional inclusion of incorrect answers. We introduce a metric of additivity and construct a benchmark, Perturbation Evaluation of Appending Knowledge (PEAK), to evaluate the degree of perturbation to neighboring knowledge when new knowledge is appended. In addition, we propose a plug-and-play framework, Appending via Preservation and Prevention (APP), which mitigates neighboring perturbation by maintaining the integrity of the answer list. Experiments demonstrate the effectiveness of APP when coupled with four editing methods on three LLMs.