Knowledge editing for large language models offers an efficient way to alter a model's behavior without degrading its overall performance. However, current approaches generalize poorly across tasks and require a distinct editor for each task, which significantly hinders broader application. To address this, we take the first step toward analyzing the multi-task generalization issue in knowledge editing. Specifically, we develop an instruction-based editing technique, termed InstructEdit, which enables the editor to adapt to multiple tasks simultaneously using simple instructions. With only one unified editor for each LLM, we empirically demonstrate that InstructEdit improves the editor's control, yielding an average 14.86% increase in Reliability in the multi-task editing setting. Furthermore, experiments on a held-out unseen task show that InstructEdit consistently surpasses previous strong baselines. To further investigate the underlying mechanisms of instruction-based knowledge editing, we analyze the principal components of the editing gradient directions, which reveals that instructions can help control the optimization direction, with stronger OOD generalization. Code and datasets are available at https://github.com/zjunlp/EasyEdit.