Fault Localization (FL) is a critical step in Automated Program Repair (APR), and its importance has grown with the rise of Large Language Model (LLM)-based repair agents. In realistic project-level repair scenarios, software repositories often span millions of tokens, far exceeding current LLM context limits. Models must therefore first identify a small, relevant subset of the code, which makes accurate FL essential for effective repair. We present a novel project-level FL approach that improves both file-level and element-level localization. Our method introduces a hierarchical reasoning module that (i) generates structured, bug-specific explanations for candidate files and elements, and (ii) leverages these explanations in a two-stage ranking scheme that combines LLM-based and embedding-based signals. We further propose a counterfactual upper-bound analysis to quantify the contribution of each localization stage to repair success. We evaluate our approach on Python and Java projects from SWE-bench Verified, SWE-bench Lite, and SWE-bench Java. Compared to state-of-the-art baselines, including Agentless and OpenHands, our method consistently improves localization accuracy. On SWE-bench Verified, file-level Hit@1 improves from 71.4% to 85%, and MRR from 81.8% to 88.8%. At the element level, Exact Match within the top-3 files increases from 36% to 69%. Integrating our localization into Agentless yields a 12.8% improvement in end-to-end repair success.
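For readers unfamiliar with the file-level metrics reported above, the following is a minimal sketch of how Hit@k and MRR are commonly computed for a ranked list of candidate files. It assumes the standard definitions (a hit when any ground-truth file appears in the top k, and the reciprocal rank of the first ground-truth file); the exact definitions used in the evaluation, for example for bugs whose fix touches several files, may differ. The function names and the toy file names are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[str], gold: Set[str], k: int) -> float:
    """Return 1.0 if any ground-truth item appears in the top-k, else 0.0."""
    return 1.0 if any(item in gold for item in ranked[:k]) else 0.0


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1/rank of the first ground-truth item, or 0.0 if none is ranked."""
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            return 1.0 / rank
    return 0.0


def evaluate(
    predictions: List[Tuple[Sequence[str], Set[str]]], k: int = 1
) -> Dict[str, float]:
    """Average Hit@k and reciprocal rank over all bugs (assumed definitions)."""
    n = len(predictions)
    if n == 0:
        return {f"hit@{k}": 0.0, "mrr": 0.0}
    hit = sum(hit_at_k(ranked, gold, k) for ranked, gold in predictions) / n
    mrr = sum(reciprocal_rank(ranked, gold) for ranked, gold in predictions) / n
    return {f"hit@{k}": hit, "mrr": mrr}


# Hypothetical toy data: (ranked candidate files, ground-truth buggy files).
toy_predictions = [
    (["a.py", "b.py"], {"a.py"}),          # gold file ranked first
    (["c.py", "d.py", "e.py"], {"d.py"}),  # gold file ranked second
    (["f.py"], {"g.py"}),                  # gold file not retrieved
]

result = evaluate(toy_predictions, k=1)
print(result)
```

On this toy input, one of three bugs has its buggy file at rank 1, so Hit@1 is 1/3, and the reciprocal ranks 1, 1/2, and 0 average to an MRR of 0.5.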