Large language models (LLMs) substantially enhance developer productivity in repository-level code generation through interactive collaboration. However, as interactions progress, the repository context must be continuously maintained and updated to incorporate newly validated information. Meanwhile, the growing session history increases cognitive burden, often leading to forgetting and the reintroduction of previously resolved errors. Existing memory management approaches show promise but remain limited by their natural-language-centric representations. To overcome these limitations, we propose CodeMEM, an AST-guided dynamic memory management system tailored for repository-level iterative code generation. Specifically, CodeMEM introduces a Code Context Memory component that dynamically maintains and updates repository context through AST-guided LLM operations, along with a Code Session Memory component that constructs a code-centric representation of the interaction history and explicitly detects and mitigates forgetting through AST-based analysis. Experimental results on the instruction-following benchmark CodeIF-Bench and the code generation benchmark CoderEval demonstrate that CodeMEM achieves state-of-the-art performance, improving instruction following by 12.2% at the turn level and 11.5% at the session level and reducing the number of interaction rounds by 2-3, while maintaining competitive inference latency and token efficiency.