Large language models (LLMs) have significantly advanced autonomous software engineering, giving rise to a growing number of software engineering agents that assist developers with automatic program repair. Issue localization forms the basis of accurate patch generation. However, owing to the limited context window of LLMs, existing issue localization methods struggle to balance concise yet effective contexts against sufficiently comprehensive search spaces. In this paper, we introduce CoSIL, an LLM-driven, simple yet powerful function-level issue localization method that requires no training or indexing. CoSIL narrows the search space via module call graphs, iteratively searches the function call graph to gather relevant context, and applies context pruning to steer the search and manage the context effectively. Notably, the call graph is constructed dynamically by the LLM during the search, eliminating the need for pre-parsing. Experimental results show that CoSIL achieves Top-1 localization success rates of 43% and 44.6% on SWE-bench Lite and SWE-bench Verified, respectively, with Qwen2.5-Coder-32B, outperforming existing methods by 8.6% to 98.2%. When CoSIL guides the patch generation stage, the resolved rate further improves by 9.3% to 31.5%.
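To illustrate the overall idea of iterative call-graph search with context pruning, the following is a minimal conceptual sketch, not the paper's implementation. All names (`llm_propose_callees`, `llm_prune`, the toy call graph) are hypothetical stand-ins; in CoSIL the call-graph expansion and pruning decisions are made by the LLM at search time.

```python
from collections import deque

# Hypothetical call graph that the LLM would infer on the fly in CoSIL;
# hard-coded here purely for illustration.
FAKE_CALL_GRAPH = {
    "app.handle_request": ["app.parse_args", "db.query"],
    "db.query": ["db.connect"],
    "app.parse_args": [],
    "db.connect": [],
}

def llm_propose_callees(function_name):
    """Stub for the LLM dynamically constructing call-graph edges."""
    return FAKE_CALL_GRAPH.get(function_name, [])

def llm_prune(candidates, issue_text):
    """Stub for context pruning: keep functions whose module name appears
    in the issue text (a toy relevance heuristic, not the real method)."""
    return [f for f in candidates if f.split(".")[0] in issue_text]

def localize(entry_points, issue_text, max_depth=3):
    """Iteratively expand the call graph from entry points, pruning
    irrelevant branches to keep the accumulated context small."""
    visited = set()
    frontier = deque((f, 0) for f in entry_points)
    ranked = []
    while frontier:
        func, depth = frontier.popleft()
        if func in visited or depth > max_depth:
            continue
        visited.add(func)
        ranked.append(func)
        for callee in llm_prune(llm_propose_callees(func), issue_text):
            frontier.append((callee, depth + 1))
    return ranked

print(localize(["app.handle_request"], "db query returns stale rows"))
```

The key design point mirrored here is that no pre-built index is consulted: edges are requested only for functions actually reached during the search, and pruning bounds how much context accumulates.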