Code localization is a fundamental challenge in repository-level software engineering tasks such as bug fixing. Existing methods equip language agents with comprehensive tools and interfaces for fetching information from the repository, but they overlook the critical aspect of memory: each instance is typically handled from scratch, assuming no prior repository knowledge. Human developers, in contrast, naturally build long-term repository memory, such as the functionality of key modules and the associations between bug types and their likely fix locations. In this work, we augment language agents with such memory by leveraging a repository's commit history, a rich yet underutilized resource that chronicles the codebase's evolution. We introduce tools that let the agent retrieve from a non-parametric memory comprising recent historical commits and their linked issues, as well as functionality summaries of actively evolving parts of the codebase identified via commit patterns. We demonstrate that augmenting LocAgent, a state-of-the-art localization framework, with such a memory can significantly improve its performance on both SWE-bench-verified and the more recent SWE-bench-live benchmarks. Our research contributes towards developing agents that can accumulate and leverage past experience for long-horizon tasks, more closely emulating the expertise of human developers.
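The abstract does not say how the commit memory is stored or queried, so the following is only a minimal sketch of what retrieval from a non-parametric memory of past commits and linked issues might look like. Everything in it is an assumption made for illustration: the `CommitRecord` schema, the function names `retrieve_commits` and `candidate_files`, the toy commits, and the use of a simple token-overlap (Jaccard) score as a stand-in for whatever retriever the authors actually use. None of it is LocAgent's API.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitRecord:
    """One memory entry: a past commit and its linked issue (hypothetical schema)."""
    sha: str
    message: str
    issue_text: str
    changed_files: tuple[str, ...]


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for a real retriever's features."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve_commits(query: str, memory: list[CommitRecord], k: int = 2) -> list[CommitRecord]:
    """Return up to k records whose commit message and issue text overlap the query most."""
    query_tokens = tokenize(query)

    def score(record: CommitRecord) -> float:
        doc_tokens = tokenize(record.message + " " + record.issue_text)
        union = query_tokens | doc_tokens
        # Jaccard similarity; assumed here only for simplicity.
        return len(query_tokens & doc_tokens) / len(union) if union else 0.0

    ranked = sorted(memory, key=score, reverse=True)
    # Drop records that share no tokens with the query at all.
    return [record for record in ranked[:k] if score(record) > 0]


def candidate_files(query: str, memory: list[CommitRecord], k: int = 2) -> list[str]:
    """Collect files touched by the retrieved commits, in rank order, without duplicates."""
    files: list[str] = []
    for record in retrieve_commits(query, memory, k):
        for path in record.changed_files:
            if path not in files:
                files.append(path)
    return files


# Toy memory; the commits, issues, and paths are invented for this example.
MEMORY = [
    CommitRecord("a1b2c3", "Fix timezone handling in date parser",
                 "Parser crashes on timezone offsets like +05:30", ("utils/dates.py",)),
    CommitRecord("d4e5f6", "Add retry logic to HTTP client",
                 "Requests fail on transient network errors", ("net/client.py", "net/retry.py")),
    CommitRecord("0718ab", "Update README badges", "", ("README.md",)),
]

NEW_ISSUE = "Date parser raises error on timezone offset"
print(candidate_files(NEW_ISSUE, MEMORY, k=1))  # ['utils/dates.py']
```

In this sketch the agent would treat the returned paths as hints about where similar bugs were fixed before, not as final answers; a localization agent would still inspect the code to confirm. A real memory would likely use a stronger retriever than token overlap, which misses paraphrases and simple inflections such as "error" versus "errors".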