Repository-level code generation with Large Language Models (LLMs) remains challenging, primarily because of complex dependencies and limited context windows. Recent approaches adopt retrieval-augmented generation (RAG) and planning mechanisms to reuse potential callee functions in the repository. However, these approaches often suffer from two limitations: they lack test-driven behavioral guidance during planning, and they overlook the implementation logic embedded in repository code during reuse. As a result, generated plans may not align with expected behaviors, and retrieved functions may not be reused effectively. In this paper, we propose TICoder, a novel repository-level code generation framework that improves both planning and reuse. TICoder introduces a test-driven iterative planning mechanism that uses test cases as behavioral specifications to refine implementation steps. Furthermore, TICoder employs an implementation-aware code reuse strategy, which retrieves potential callee functions using a dual-view similarity that captures both functional and implementation aspects. TICoder then identifies relevant usage patterns through a dual-stage selection strategy that combines structure-based clustering with perplexity-based filtering. We conduct extensive experiments on widely used repository-level code generation benchmarks with various LLMs. Experimental results demonstrate that TICoder outperforms state-of-the-art (SOTA) methods, achieving an average improvement of 11.52%.
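The abstract does not state how the dual-view similarity is computed, so the following is only an illustrative sketch of the general idea, not TICoder's actual method. It assumes that the functional view compares a requirement description against each candidate's docstring, that the implementation view compares a planned implementation step against each candidate's body, and that the two views are mixed with a weight `alpha`. Token-level Jaccard overlap stands in for whatever similarity model the authors use. All names here (`Candidate`, `dual_view_score`, `retrieve`, `alpha`, `repo_functions`) are hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical repository function considered for reuse."""
    name: str
    doc: str   # functional view: what the function is for
    body: str  # implementation view: how the function does it


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens; a crude stand-in for a learned encoder."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two token sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def dual_view_score(requirement: str, plan_step: str,
                    cand: Candidate, alpha: float = 0.5) -> float:
    """Assumed combination: weighted sum of functional and implementation overlap."""
    functional = jaccard(tokens(requirement), tokens(cand.doc))
    implementation = jaccard(tokens(plan_step), tokens(cand.body))
    return alpha * functional + (1.0 - alpha) * implementation


def retrieve(requirement: str, plan_step: str, candidates: list[Candidate],
             k: int = 1, alpha: float = 0.5) -> list[Candidate]:
    """Return the k candidates with the highest dual-view score."""
    ranked = sorted(
        candidates,
        key=lambda c: dual_view_score(requirement, plan_step, c, alpha),
        reverse=True,
    )
    return ranked[:k]


# Toy repository used only for illustration.
repo_functions = [
    Candidate("load_config", "Load configuration from a JSON file",
              "with open(path) as f: return json.load(f)"),
    Candidate("send_email", "Send an email message to a user",
              "smtp.sendmail(sender, to, msg)"),
]

top = retrieve("Read settings from a JSON file",
               "open the path and call json.load",
               repo_functions)
print(top[0].name)
```

Setting `alpha` to 1.0 in this sketch reduces it to purely functional retrieval, which makes it easy to see how much the implementation view changes the ranking on a given repository.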