Large Language Models (LLMs) require efficient knowledge editing (KE) to update factual information, yet existing methods exhibit significant performance decay in multi-hop factual recall. This failure is particularly acute when edits involve intermediate implicit subjects within reasoning chains. Through causal analysis, we reveal that this limitation stems from overlooking how chained knowledge is dynamically represented and utilized at the neuron level. We discover that during multi-hop reasoning, implicit subjects function as query neurons, which sequentially activate corresponding value neurons across transformer layers to accumulate information toward the final answer, a dynamic that prior KE work has overlooked. Guided by this insight, we propose ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall, a framework that leverages neuron-level attribution to identify and edit these critical query-value (Q-V) pathways. ACE provides a mechanistically grounded solution for multi-hop KE, empirically outperforming state-of-the-art methods by 9.44% on GPT-J and 37.46% on Qwen3-8B. Our analysis further reveals more fine-grained activation patterns in Qwen3 and demonstrates that the semantic interpretability of value neurons is orchestrated by query-driven accumulation. These findings establish a new pathway for advancing KE capabilities based on a principled understanding of internal reasoning mechanisms.