Pre-trained large language models (LLMs) have recently emerged as a breakthrough technology in natural language processing and artificial intelligence, with the ability to handle large-scale datasets and exhibit remarkable performance across a wide range of tasks. Meanwhile, software testing is a crucial undertaking that serves as a cornerstone for ensuring the quality and reliability of software products. As the scope and complexity of software systems continue to grow, the need for more effective software testing techniques becomes increasingly urgent, making the field ripe for innovative approaches such as the use of LLMs. This paper provides a comprehensive review of the utilization of LLMs in software testing. It analyzes 102 relevant studies that have used LLMs for software testing, from both the software testing and LLM perspectives. The paper presents a detailed discussion of the software testing tasks for which LLMs are commonly used, among which test case preparation and program repair are the most representative. It also analyzes the commonly used LLMs, the types of prompt engineering employed, and the techniques that accompany these LLMs. Furthermore, it summarizes the key challenges and potential opportunities in this direction. This work can serve as a roadmap for future research in this area, highlighting potential avenues for exploration and identifying gaps in our current understanding of the use of LLMs in software testing.