Recently, retrieval-augmented generation (RAG) has been successfully applied to code generation. However, existing pipelines for retrieval-augmented code generation (RACG) employ static, single-source knowledge bases, which limits the ability of Large Language Models (LLMs) to adapt to domains in which they have insufficient knowledge. In this work, we develop a novel pipeline, EVOR, that employs the synchronous evolution of both queries and diverse knowledge bases. For two realistic settings in which external knowledge is required to solve code generation tasks, we compile four new datasets associated with frequently updated libraries and long-tail programming languages, collectively named EVOR-BENCH. Extensive experiments demonstrate that EVOR achieves two to four times the execution accuracy of other methods such as Reflexion (Shinn et al., 2024) and DocPrompting (Zhou et al., 2023). We also demonstrate that EVOR is flexible and can be easily combined with these methods to achieve further improvement. Further analysis reveals that EVOR benefits from the synchronous evolution of queries and documents and from the diverse information sources in the knowledge base. We hope that our studies will inspire more insights into the design of advanced RACG pipelines in future research. Our model, code, and data are available at https://arks-codegen.github.io.
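To make the idea of evolving queries and the knowledge base together more concrete, the following is a minimal toy sketch, not the EVOR implementation. Everything in it is an assumption made for illustration: the word-overlap retriever, the `toy_generate` stand-in for an LLM, the made-up library in which `old_sum` was replaced by `total`, and the choice to feed execution feedback into both the query and the knowledge base.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for a real retriever's representation."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


class KnowledgeBase:
    """Toy knowledge base; the mix of sources it holds is assumed, not taken from the paper."""

    def __init__(self, entries: list[str]) -> None:
        self.entries = list(entries)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Rank entries by word overlap with the query and keep the top k.
        q = tokens(query)
        ranked = sorted(self.entries, key=lambda e: len(q & tokens(e)), reverse=True)
        return ranked[:k]

    def add(self, entry: str) -> None:
        if entry not in self.entries:
            self.entries.append(entry)


def toy_generate(task: str, docs: list[str]) -> str:
    """Hypothetical stand-in for an LLM: calls total() only if a retrieved doc mentions it."""
    name = "total" if any("total(" in d for d in docs) else "old_sum"
    return f"{name}([1, 2, 3])"


def toy_execute(program: str) -> tuple[bool, str]:
    """Run the program against a made-up library in which old_sum has been removed."""
    def old_sum(xs):
        raise RuntimeError("old_sum was removed in version 2")

    try:
        return True, str(eval(program, {"old_sum": old_sum, "total": sum}))
    except Exception as exc:
        return False, str(exc)


def evolve(task, kb, generate, execute, max_rounds=3):
    """Retrieve, generate, execute; on failure update both the knowledge base and the query."""
    query, program = task, ""
    for _ in range(max_rounds):
        program = generate(task, kb.retrieve(query))
        ok, feedback = execute(program)
        if ok:
            break
        kb.add(feedback)              # the knowledge base evolves
        query = f"{task} {feedback}"  # the query evolves in the same round
    return program, query


kb = KnowledgeBase([
    "sum a list with old_sum(xs)",
    "sort a list with sorted(xs)",
    "old_sum was removed in version 2, use total(xs)",
])
program, query = evolve("sum a list", kb, toy_generate, toy_execute)
print(program)
```

In this toy run the first query retrieves only the outdated entry, so the generated call fails. The error message is then added to the knowledge base and appended to the query, which lets the second retrieval surface the entry describing the replacement function.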