Distributed Multi-Layer Editing for Rule-Level Knowledge in Large Language Models

Large language models store not only isolated facts but also rules that support reasoning across symbolic expressions, natural language explanations, and concrete instances. Yet most model editing methods are built for fact-level knowledge, assuming that a target edit can be achieved through a localized intervention. This assumption does not hold for rule-level knowledge, where a single rule must remain consistent across multiple interdependent forms. We investigate this problem through a mechanistic study of rule-level knowledge editing. To support this study, we extend the RuleEdit benchmark from 80 to 200 manually verified rules spanning mathematics and physics. Fine-grained causal tracing reveals a form-specific organization of rule knowledge in transformer layers: formulas and descriptions are concentrated in earlier layers, while instances are more associated with middle layers. These results suggest that rule knowledge is not uniformly localized, and therefore cannot be reliably edited by a single-layer or contiguous-block intervention. Based on this insight, we propose Distributed Multi-Layer Editing (DMLE), which applies a shared early-layer update to formulas and descriptions and a separate middle-layer update to instances. While remaining competitive on standard editing metrics, DMLE achieves substantially stronger rule-level editing performance. On average, it improves instance portability and rule understanding by 13.91 and 50.19 percentage points, respectively, over the strongest baseline across GPT-J-6B, Qwen2.5-7B, Qwen2-7B, and LLaMA-3-8B. The code is available at https://github.com/Pepper66/DMLE.
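The form-specific routing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it only shows how edit requests for one rule might be grouped so that formulas and descriptions share an early-layer update while instances receive a separate middle-layer update. The layer indices, the `EditRequest` type, and the `plan_edits` helper are hypothetical names and values introduced for this example; the abstract does not state concrete layer numbers, and the actual weight update is out of scope here.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical layer indices, chosen only for illustration.
# The abstract says "earlier" and "middle" layers without giving numbers.
GROUP_TO_LAYERS = {
    "early": (3, 4, 5),
    "middle": (14, 15, 16),
}

# Form-to-group mapping as stated in the abstract: formulas and descriptions
# share the early-layer update; instances get a separate middle-layer update.
FORM_TO_GROUP = {
    "formula": "early",
    "description": "early",
    "instance": "middle",
}


@dataclass(frozen=True)
class EditRequest:
    """One form of a rule to be edited (hypothetical container type)."""
    rule_id: str
    form: str      # "formula", "description", or "instance"
    prompt: str
    target: str


def plan_edits(requests):
    """Group edit requests by the layer group that should receive them."""
    plan = defaultdict(list)
    for req in requests:
        if req.form not in FORM_TO_GROUP:
            raise ValueError(f"unknown rule form: {req.form!r}")
        plan[FORM_TO_GROUP[req.form]].append(req)
    return dict(plan)


if __name__ == "__main__":
    # Toy requests with placeholder prompts and targets.
    requests = [
        EditRequest("rule-1", "formula", "formula prompt", "new formula"),
        EditRequest("rule-1", "description", "description prompt", "new description"),
        EditRequest("rule-1", "instance", "instance prompt", "new answer"),
    ]
    for group, reqs in plan_edits(requests).items():
        forms = [r.form for r in reqs]
        print(group, GROUP_TO_LAYERS[group], forms)
```

In a full pipeline, each group would then be handed to an editing routine that computes one update over that group's layers. Keeping the grouping step separate from the update step makes it easy to swap in different layer ranges per model.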
