Large Language Models (LLMs) require efficient knowledge editing (KE) to update factual information, yet existing methods exhibit significant performance decay in multi-hop factual recall. This failure is particularly acute when edits involve intermediate implicit subjects within reasoning chains. Through causal analysis, we reveal that this limitation stems from an oversight of how chained knowledge is dynamically represented and utilized at the neuron level. We discover that during multi-hop reasoning, implicit subjects function as query neurons, which sequentially activate corresponding value neurons across transformer layers to accumulate information toward the final answer, a dynamic that prior KE work has overlooked. Guided by this insight, we propose ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall, a framework that leverages neuron-level attribution to identify and edit these critical query-value (Q-V) pathways. ACE provides a mechanistically grounded solution for multi-hop KE, empirically outperforming state-of-the-art methods by 9.44% on GPT-J and 37.46% on Qwen3-8B. Our analysis further reveals finer-grained activation patterns in Qwen3 and demonstrates that the semantic interpretability of value neurons is orchestrated by query-driven accumulation. These findings establish a new pathway for advancing KE capabilities through a principled understanding of internal reasoning mechanisms.