Recently, there has been growing interest in knowledge editing for Large Language Models (LLMs). Current approaches and evaluations only explore instance-level editing, while whether LLMs possess the capability to modify concepts remains unclear. This paper pioneers the investigation of editing conceptual knowledge for LLMs by constructing a novel benchmark dataset, ConceptEdit, and establishing a suite of new metrics for evaluation. The experimental results reveal that, although existing editing methods can modify concept-level definitions to some extent, they also risk distorting the related instance-level knowledge in LLMs, leading to poor performance. We anticipate this work can inspire further progress in better understanding LLMs. Our project homepage is available at https://zjunlp.github.io/project/ConceptEdit.