This paper provides a survey of the emerging area of Large Language Models (LLMs) for Software Engineering (SE). It also sets out open research challenges for the application of LLMs to technical problems faced by software engineers. LLMs' emergent properties bring novelty and creativity with applications right across the spectrum of Software Engineering activities including coding, design, requirements, repair, refactoring, performance improvement, documentation and analytics. However, these very same emergent properties also pose significant technical challenges; we need techniques that can reliably weed out incorrect solutions, such as hallucinations. Our survey reveals the pivotal role that hybrid techniques (traditional SE plus LLMs) have to play in the development and deployment of reliable, efficient and effective LLM-based SE.