Although Large Language Models (LLMs) have made significant progress in code generation, they still struggle with code generation tasks in specific scenarios. These scenarios usually require adapting LLMs to specific needs, but the limited training samples available in practice lead to poor code generation performance. How to effectively adapt LLMs to new scenarios with few training samples is therefore a major challenge for current code generation. In this paper, we propose a novel adaptation approach named SEED, which stands for Sample-Efficient adaptation with Error-Driven learning for code generation. SEED treats the errors made by an LLM as learning opportunities, using error revision to overcome the model's own shortcomings and thereby learn efficiently. Specifically, SEED identifies erroneous code generated by the LLM, employs Self-revise to revise that code, optimizes the model with the revised code, and iterates this process for continuous improvement. Experimental results show that, compared to other mainstream fine-tuning approaches, SEED achieves superior performance with few training samples, with an average relative improvement of 54.7% in Pass@1 on multiple code generation benchmarks. We also validate the effectiveness of Self-revise: the revised code it generates optimizes the model more efficiently than the code samples from the datasets. Moreover, SEED consistently demonstrates strong performance across various LLMs, underscoring its generalizability.
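The four steps named in the abstract (identify erroneous code, revise it, optimize on the revisions, iterate) can be read as a simple loop. The following is a minimal sketch of that loop, not the authors' implementation. Every name in it (`seed_adapt`, `generate`, `passes_tests`, `self_revise`, `fine_tune`) is a hypothetical placeholder supplied by the caller, and two details are assumptions that the abstract does not state: that errors are detected by running tests, and that a revision is kept only if it passes those tests.

```python
from typing import Any, Callable, List, Tuple


def seed_adapt(
    model: Any,
    requirements: List[str],
    generate: Callable[[Any, str], str],
    passes_tests: Callable[[str, str], bool],
    self_revise: Callable[[str, str], str],
    fine_tune: Callable[[Any, List[Tuple[str, str]]], Any],
    iterations: int = 2,
) -> Any:
    """Hypothetical error-driven adaptation loop (illustrative only)."""
    for _ in range(iterations):
        revised_pairs: List[Tuple[str, str]] = []
        for requirement in requirements:
            # Step 1: generate code and check whether it is erroneous.
            candidate = generate(model, requirement)
            if passes_tests(requirement, candidate):
                continue  # correct output, so there is no error to learn from
            # Step 2: revise the erroneous code.
            revised = self_revise(requirement, candidate)
            # Assumption: keep only revisions that pass the tests.
            if passes_tests(requirement, revised):
                revised_pairs.append((requirement, revised))
        if not revised_pairs:
            break  # nothing left to learn from in this round
        # Step 3: optimize the model on the revised code.
        model = fine_tune(model, revised_pairs)
        # Step 4: the outer loop repeats the process with the updated model.
    return model
```

Passing the four operations in as callables keeps the sketch independent of any particular model, test harness, or training library, each of which would have to be supplied by the reader.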