Large language models (LLMs) have shown remarkable progress in automated code generation. Yet incorporating LLM-based code generation into real-life software projects poses challenges, as the generated code may misuse APIs, classes, or data structures, or may omit project-specific information. Because much of this project-specific context cannot fit into an LLM prompt, we must find ways to let the model explore the project-level code context. To this end, this paper puts forward a novel approach, termed ProCoder, which iteratively refines the project-level code context for precise code generation, guided by compiler feedback. In particular, ProCoder first leverages compiler techniques to identify mismatches between the generated code and the project's context. It then iteratively aligns and fixes the identified errors using information extracted from the code repository. We integrate ProCoder with two representative LLMs, GPT-3.5-Turbo and Code Llama (13B), and apply it to Python code generation. Experimental results show that ProCoder significantly improves on the vanilla LLMs, by over 80%, in generating code that depends on project context, and consistently outperforms existing retrieval-based code generation baselines.
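The generate, check, retrieve, and regenerate loop described above can be illustrated with a minimal Python sketch. This is not ProCoder's implementation: the compiler check is approximated here with Python's standard `ast` module and a toy symbol table, and `PROJECT_SYMBOLS`, `find_mismatches`, `retrieve_context`, `refine`, and `fake_llm` are all hypothetical names introduced only for illustration.

```python
import ast

# Hypothetical project index: module name -> names that module really defines.
PROJECT_SYMBOLS = {"storage": {"load_user", "save_user"}}


def find_mismatches(code, symbols):
    """Report attribute accesses on known project modules that do not exist."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    errors = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in symbols
                and node.attr not in symbols[node.value.id]):
            errors.append(f"{node.value.id}.{node.attr} is not defined")
    return errors


def retrieve_context(errors, symbols):
    """List the real names of every project module mentioned in an error."""
    hints = []
    for module, names in symbols.items():
        if any(err.startswith(module + ".") for err in errors):
            hints.append(f"{module} defines: {', '.join(sorted(names))}")
    return hints


def refine(task, generate, symbols, max_rounds=3):
    """Regenerate code until the check passes or the round budget runs out."""
    context = []
    code = generate(task, context)
    for _ in range(max_rounds):
        errors = find_mismatches(code, symbols)
        if not errors:
            break
        # Feed both the diagnostics and the retrieved project facts back in.
        context = context + errors + retrieve_context(errors, symbols)
        code = generate(task, context)
    return code


def fake_llm(task, context):
    """Stand-in for a real model: guesses a wrong API until told the right one."""
    if any("load_user" in line for line in context):
        return "import storage\nuser = storage.load_user(42)\n"
    return "import storage\nuser = storage.get_user(42)\n"


if __name__ == "__main__":
    print(refine("load user 42", fake_llm, PROJECT_SYMBOLS))
```

In this toy run the first draft calls a nonexistent `storage.get_user`, the check flags it, and the second draft uses a name drawn from the retrieved context. A real system would replace `fake_llm` with an actual model call and the `ast` check with a proper static analysis of the repository.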