A Survey on Large Language Models for Code Generation

Large Language Models (LLMs) applied to code, known as Code LLMs, have made remarkable advances across diverse code-related tasks, particularly in code generation, where the model produces source code from natural language descriptions. This burgeoning field has attracted significant interest from both academic researchers and industry professionals because of its practical significance in software development, as exemplified by GitHub Copilot. Although LLMs have been actively explored for a variety of code tasks from the perspective of natural language processing (NLP), software engineering (SE), or both, a comprehensive and up-to-date literature review dedicated to LLMs for code generation is noticeably absent. In this survey, we aim to bridge this gap with a systematic literature review that serves as a reference for researchers investigating the cutting-edge progress in LLMs for code generation. We introduce a taxonomy to categorize and discuss recent developments in LLMs for code generation, covering data curation, latest advances, performance evaluation, and real-world applications. In addition, we present a historical overview of the evolution of LLMs for code generation and offer an empirical comparison on the widely recognized HumanEval and MBPP benchmarks to highlight the progressive improvement in LLM capabilities for code generation. We identify critical challenges and promising opportunities regarding the gap between academia and practical development. Furthermore, we have established a dedicated resource website (https://codellm.github.io) to continuously document and disseminate the most recent advances in the field.

