This paper provides a survey of the emerging area of Large Language Models (LLMs) for Software Engineering (SE). It also sets out open research challenges for the application of LLMs to technical problems faced by software engineers. LLMs' emergent properties bring novelty and creativity, with applications across the full spectrum of Software Engineering activities, including coding, design, requirements, repair, refactoring, performance improvement, documentation, and analytics. However, these same emergent properties also pose significant technical challenges: we need techniques that can reliably weed out incorrect solutions, such as hallucinations. Our survey reveals the pivotal role that hybrid techniques (traditional SE plus LLMs) have to play in the development and deployment of reliable, efficient, and effective LLM-based SE.