Code generation with large language models has shown significant promise, especially when employing retrieval-augmented generation (RAG) with few-shot examples. However, selecting examples that actually improve generation quality remains challenging, particularly when the target programming language (PL) is underrepresented. In this study, we present two key findings: (1) retrieving examples whose algorithmic plans can be referenced when generating the desired behavior significantly improves generation accuracy, and (2) converting code into pseudocode effectively captures such algorithmic plans, improving retrieval quality even when the source and target PLs differ. Based on these findings, we propose Plan-as-query Example Retrieval for few-shot prompting in Code generation (PERC), a novel framework that uses algorithmic plans to identify and retrieve effective examples. We validate the effectiveness of PERC through extensive experiments on the CodeContests, HumanEval, and MultiPL-E benchmarks: PERC consistently outperforms state-of-the-art RAG methods for code generation, whether the source and target PLs match or differ, highlighting its adaptability and robustness across diverse coding environments.
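The plan-as-query idea above can be sketched minimally: each candidate example is stored with a pseudocode plan, and retrieval ranks examples by how well their plans match a plan drafted for the new problem, rather than by surface-level code similarity. This is only an illustrative sketch under stated assumptions: the `EXAMPLES` pool, the hand-written plans, and the bag-of-words `similarity` function are hypothetical stand-ins (PERC derives plans from code with an LLM and uses a proper retriever, not word-count cosine similarity).

```python
import math
from collections import Counter

# Hypothetical example pool: each entry pairs a problem with a
# language-agnostic pseudocode "plan" (in PERC, plans come from an LLM).
EXAMPLES = [
    {"problem": "reverse a string",
     "plan": "iterate characters from end append to result"},
    {"problem": "sum of array",
     "plan": "initialize total zero loop over items add to total"},
    {"problem": "find max element",
     "plan": "track best value loop over items update if larger"},
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words token counts (toy retriever)."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_plan: str, pool=EXAMPLES, k=1):
    """Return the k examples whose plans best match the query plan."""
    return sorted(pool, key=lambda e: similarity(query_plan, e["plan"]),
                  reverse=True)[:k]

# A plan drafted for a new problem acts as the query; the example's
# algorithmic structure, not its programming language, drives the match.
hits = retrieve("loop over items add each to running total")
```

Because both the query and the stored examples are expressed as plans rather than concrete code, the same retrieval step works even when the pool's source language differs from the target language of generation.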