Large Language Models (LLMs) power numerous AI applications, yet updating their knowledge remains costly. Model editing provides a lightweight alternative through targeted parameter modifications, with meta-learning-based model editing (MLME) demonstrating strong effectiveness and efficiency. However, we find that MLME struggles in low-data regimes and, due to its use of KL divergence, incurs high training costs. To address these issues, we propose $\textbf{E}$fficient $\textbf{M}$ulti-$\textbf{S}$tep $\textbf{Edit}$ (EMSEdit), which leverages multi-step backpropagation (MSBP) to effectively capture gradient-activation mapping patterns within editing samples, performs multi-step edits per sample to enhance editing performance under limited data, and introduces norm-based regularization to preserve unedited knowledge while improving training efficiency. Experiments on two datasets and three LLMs show that EMSEdit consistently outperforms state-of-the-art methods in both sequential and batch editing. Moreover, MSBP can be seamlessly integrated into existing approaches to yield additional performance gains. Further experiments on a multi-hop reasoning editing task demonstrate EMSEdit's robustness in handling complex edits, while ablation studies validate the contribution of each design component. Our code is available at https://github.com/xpq-tech/emsedit.
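To make the training-cost argument concrete, the sketch below (not the authors' code; all names and shapes are illustrative assumptions) contrasts the two locality penalties the abstract mentions. A KL-divergence penalty requires an extra forward pass of the edited model on held-out unedited inputs to compare output distributions, whereas a norm-based penalty regularizes the parameter update itself and needs no extra forward pass:

```python
# Illustrative sketch only: compares a KL-divergence locality loss against a
# norm-based one for a single edited weight matrix. The layer, inputs, and
# update here are random stand-ins, not the paper's actual objective.
import torch

torch.manual_seed(0)

W = torch.randn(8, 8)            # pre-edit weight (stand-in for one edited layer)
delta = 0.1 * torch.randn(8, 8)  # candidate edit produced by the editor
x_unedited = torch.randn(4, 8)   # held-out inputs representing unedited knowledge

def kl_locality(W, delta, x):
    """KL penalty: needs a forward pass of both pre- and post-edit weights."""
    p = torch.softmax(x @ W.T, dim=-1)                    # pre-edit predictions
    log_q = torch.log_softmax(x @ (W + delta).T, dim=-1)  # post-edit predictions
    return torch.nn.functional.kl_div(log_q, p, reduction="batchmean")

def norm_locality(delta):
    """Norm penalty: depends only on the update, no forward pass required."""
    return delta.norm(p="fro") ** 2

kl = kl_locality(W, delta, x_unedited)
nrm = norm_locality(delta)
print(f"KL locality loss:   {kl.item():.4f}")
print(f"Norm locality loss: {nrm.item():.4f}")
```

Both penalties vanish when the edit is zero, but only the KL variant touches unedited data at training time, which is the efficiency gap the abstract attributes to norm-based regularization.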