Model editing has emerged as a practical approach for mitigating factual errors and outdated knowledge in large language models (LLMs). Among existing methods, the Locate-and-Edit (L&E) paradigm is the dominant framework: it locates the MLP parameters implicated in expressing a target fact and then performs a localized update to rewrite that fact. However, long sequences of edits often trigger abrupt model collapse in L&E methods once a critical point is passed. We empirically identify a strong correlation between collapse and explosive growth of the edited MLP weight norms, and formally prove that commonly used L&E update rules can induce exponential norm growth across sequential edits in the absence of explicit norm control. To address this issue, we propose Norm-Anchor Scaling (NAS), a plug-and-play norm-constrained strategy. Across extensive experiments, NAS delays the collapse point of representative L&E algorithms by more than 4x and yields a 72.2% average relative gain in editing performance, while requiring only a single additional line of code and incurring negligible computational overhead.
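The abstract does not spell out the NAS update rule, but a "single additional line of code" that anchors weight norms suggests rescaling the edited matrix back to its pre-editing norm after each update. The sketch below is a hypothetical illustration of that idea (the function name, the choice of Frobenius norm, and the rescaling rule are our assumptions, not the paper's exact method), showing how anchoring prevents the norm from growing across a long edit sequence:

```python
import numpy as np

def norm_anchored_edit(W, delta, anchor_norm):
    """Apply an edit delta to weight matrix W, then rescale the result so its
    Frobenius norm matches anchor_norm (the norm of the unedited weights).
    Hypothetical sketch of a norm-anchoring strategy; the paper's NAS rule
    may differ in detail."""
    W_new = W + delta
    return W_new * (anchor_norm / np.linalg.norm(W_new))

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
anchor = np.linalg.norm(W)  # norm of the original, unedited weights

# Simulate a long sequence of edits: without anchoring, repeated additive
# updates can grow the norm without bound; with anchoring it stays fixed.
for _ in range(100):
    W = norm_anchored_edit(W, 0.5 * rng.standard_normal((8, 8)), anchor)

print(np.linalg.norm(W) - anchor)  # the norm remains pinned to the anchor
```

Because the rescaling is a single multiplication appended to the existing update, it slots into an L&E pipeline without changing the edit computation itself, which is consistent with the claimed one-line integration and negligible overhead.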