Mutation-based Fault Localization (MBFL) has been widely explored for automated software debugging; it leverages artificial mutants to identify faulty code entities. However, MBFL faces a significant challenge from interference mutants, which are generated from non-faulty code entities but can nonetheless be killed by failing tests. These mutants mimic the test-sensitivity behavior of real faulty code entities and weaken the effectiveness of fault localization. To address this challenge, we introduce the concept of Fault Localization Interference Mutants (FLIMs) and conduct a theoretical analysis based on the Reachability, Infection, Propagation, and Revealability (RIPR) model, identifying four distinct causes of interference. Building on this analysis, we propose a novel approach that recognizes FLIMs through LLM-based semantic analysis, enhanced by fine-tuning techniques and confidence estimation strategies to address the instability of LLM output. The recognized FLIMs are then mitigated by refining the suspiciousness scores calculated by MBFL techniques. We integrate FLIM recognition and mitigation into the MBFL workflow to develop MBFL-FLIM, a fault localization framework that enhances MBFL's effectiveness by reducing misleading interference while preserving real fault-revealing information. Our empirical experiments on the Defects4J benchmark, covering 395 program versions and eight LLMs, demonstrate MBFL-FLIM's superiority over traditional SBFL and MBFL methods, advanced dynamic feature-based approaches, and recent LLM-based fault localization techniques. Specifically, MBFL-FLIM achieves an average improvement of 44 faults in the Top-1 metric over the baseline methods. Further evaluation confirms MBFL-FLIM's robust performance in multi-fault scenarios, and ablation experiments validate the contributions of the fine-tuning and confidence estimation components.
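To make the interference effect concrete, the following is a minimal sketch of how a FLIM can distort an MBFL ranking and how refining suspiciousness scores can counter it. It is an illustration under stated assumptions, not the MBFL-FLIM implementation: the Ochiai formula over mutant kill counts and the max-over-mutants aggregation are one common MBFL formulation, while the `flim_confidence` field and the rule of scaling a mutant's score by `1 - flim_confidence` are hypothetical stand-ins for the recognition and mitigation steps, whose exact form the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class Mutant:
    statement: str        # identifier of the mutated code entity
    killed_failing: int   # number of failing tests that kill this mutant
    killed_passing: int   # number of passing tests that kill this mutant
    # Hypothetical field: estimated probability that this mutant is a FLIM,
    # e.g. as produced by some recognizer. Not taken from the paper.
    flim_confidence: float = 0.0


def ochiai(m: Mutant, total_failing: int) -> float:
    """Ochiai suspiciousness of a single mutant from its kill counts."""
    denom = math.sqrt(total_failing * (m.killed_failing + m.killed_passing))
    return m.killed_failing / denom if denom else 0.0


def statement_scores(mutants, total_failing, mitigate=True):
    """Aggregate mutant scores per statement (maximum over its mutants).

    With mitigate=True, each mutant's score is scaled by
    (1 - flim_confidence). This scaling rule is an assumption chosen
    for illustration only.
    """
    scores = {}
    for m in mutants:
        s = ochiai(m, total_failing)
        if mitigate:
            s *= 1.0 - m.flim_confidence
        scores[m.statement] = max(scores.get(m.statement, 0.0), s)
    return scores


# Toy data: "s1" is the faulty statement; "s2" is non-faulty, but its
# mutant is killed by both failing tests, so it behaves like a FLIM.
mutants = [
    Mutant("s1", killed_failing=2, killed_passing=2),
    Mutant("s2", killed_failing=2, killed_passing=0, flim_confidence=0.9),
]

raw = statement_scores(mutants, total_failing=2, mitigate=False)
refined = statement_scores(mutants, total_failing=2, mitigate=True)
```

In this toy setting, the unmitigated scores rank the non-faulty statement `s2` above the faulty statement `s1`, and the refined scores reverse that order. The example only shows the direction of the effect; how well any such rule works in practice depends on the quality of the FLIM recognizer.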