Code generation with large language models has shown significant promise, especially when employing retrieval-augmented generation (RAG) with few-shot examples. However, selecting effective examples that enhance generation quality remains a challenging task, particularly when the target programming language (PL) is underrepresented. In this study, we present two key findings: (1) retrieving examples whose presented algorithmic plans can be referenced for generating the desired behavior significantly improves generation accuracy, and (2) converting code into pseudocode effectively captures such algorithmic plans, enhancing retrieval quality even when the source and the target PLs are different. Based on these findings, we propose Plan-as-query Example Retrieval for few-shot prompting in Code generation (PERC), a novel framework that utilizes algorithmic plans to identify and retrieve effective examples. We validate the effectiveness of PERC through extensive experiments on the CodeContests, HumanEval, and MultiPL-E benchmarks: PERC consistently outperforms state-of-the-art RAG methods in code generation, whether the source and target programming languages match or differ, highlighting its adaptability and robustness in diverse coding environments.
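The plan-as-query idea can be sketched schematically: each candidate example is stored with a pseudocode plan, and retrieval ranks examples by the similarity between those plans and a plan drafted for the new problem. The sketch below is a minimal stand-in, not the paper's implementation; it substitutes a bag-of-words cosine similarity for the dense retriever and LLM-generated pseudocode the actual framework would use, and all function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def plan_tokens(pseudocode: str) -> Counter:
    # Bag-of-words representation of a pseudocode plan
    # (a crude stand-in for a learned plan embedding).
    return Counter(pseudocode.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_plan: str, corpus: list[dict], k: int = 1) -> list[dict]:
    # Rank stored examples by how closely their pseudocode plans match
    # the plan drafted for the new problem; return the top-k examples.
    q = plan_tokens(query_plan)
    ranked = sorted(
        corpus,
        key=lambda ex: cosine(q, plan_tokens(ex["plan"])),
        reverse=True,
    )
    return ranked[:k]

# Toy example pool: each entry pairs a pseudocode plan with its code.
corpus = [
    {"plan": "sort the array then binary search for the target",
     "code": "def solve(xs, t): xs.sort(); ..."},
    {"plan": "count character frequencies with a hash map",
     "code": "def solve(s): ..."},
]

top = retrieve("binary search over sorted values", corpus, k=1)
```

Because the query is itself a plan rather than raw code, the same retrieval step works even when the stored examples are written in a different programming language from the target, which is the cross-PL setting the abstract highlights.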