Knowledge editing aims to change a language model's behavior on a set of target cases (i.e., the editing scope) by injecting the corresponding expected knowledge into the model. With the recent advancements in large language models (LLMs), knowledge editing has emerged as a promising technique for adapting LLMs to new knowledge without retraining from scratch. However, most previous studies neglect the multilingual nature of some mainstream LLMs (e.g., LLaMA, ChatGPT, and GPT-4) and typically focus on monolingual scenarios, where LLMs are edited and evaluated in the same language. As a result, the effect of editing in a source language on a different target language remains unknown. In this paper, we aim to characterize this cross-lingual effect in knowledge editing. Specifically, we first construct a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese. We then apply various knowledge editing methods, covering different paradigms, to perform edits in English and evaluate the edited models in Chinese, and vice versa. To analyze the cross-lingual effect in depth, the evaluation covers four aspects: reliability, generality, locality, and portability. Furthermore, we analyze the inconsistent behaviors of the edited models and discuss their specific challenges. Data and code are available at https://github.com/krystalan/Bi_ZsRE
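The four evaluation aspects named in the abstract can be made concrete with a minimal sketch. The snippet below is an illustrative harness, not the paper's actual evaluation code: the edited model is mocked as a prompt-to-answer lookup, and all field names (`rewrite`, `paraphrase`, `locality_prompt`, `portability_prompt`) are hypothetical, not the dataset's real schema.

```python
def evaluate_edit(model_answer, record, pre_edit_answer):
    """Score a single edit on the four aspects (1.0 = success, 0.0 = failure)."""
    return {
        # Reliability: the edited prompt itself now yields the new target answer.
        "reliability": float(model_answer(record["rewrite"]) == record["target"]),
        # Generality: a paraphrase of the edited prompt also yields the target.
        "generality": float(model_answer(record["paraphrase"]) == record["target"]),
        # Locality: an unrelated prompt still yields its pre-edit answer.
        "locality": float(model_answer(record["locality_prompt"]) == pre_edit_answer),
        # Portability: a question one reasoning hop away from the new fact.
        "portability": float(
            model_answer(record["portability_prompt"]) == record["portability_target"]
        ),
    }

# Toy "edited model": suppose an edit changed the answer to "Capital of X?" to "B".
edited = {
    "Capital of X?": "B",
    "What city is X's capital?": "B",  # paraphrase of the edited prompt
    "Capital of Y?": "C",              # unrelated fact, should stay untouched
    "Country of city B?": "X",         # one-hop question about the new fact
}
record = {
    "rewrite": "Capital of X?", "target": "B",
    "paraphrase": "What city is X's capital?",
    "locality_prompt": "Capital of Y?",
    "portability_prompt": "Country of city B?", "portability_target": "X",
}
scores = evaluate_edit(edited.get, record, pre_edit_answer="C")
```

In the cross-lingual setting studied here, the same four scores would additionally be computed with prompts in the non-edited language (e.g., edit in English, query in Chinese), which is where the inconsistent behaviors discussed in the paper appear.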