Automated Program Repair (APR) aims to patch software bugs automatically and reduce manual debugging effort. Recently, with advances in Large Language Models (LLMs), a growing number of LLM-based APR techniques have been proposed, facilitating software development and maintenance and demonstrating remarkable performance. However, because the LLM-based APR field is still being actively explored, it is difficult for researchers to keep track of current achievements, open challenges, and potential opportunities. This work provides the first systematic literature review summarizing the applications of LLMs in APR between 2020 and 2024. We analyze 127 relevant papers from the perspectives of LLMs, APR, and their integration. First, we categorize the popular LLMs that have been applied to support APR and outline three types of utilization strategies for their deployment. Second, we detail specific repair scenarios that benefit from LLMs, e.g., semantic bugs and security vulnerabilities. Third, we discuss several critical aspects of integrating LLMs into APR research, e.g., input forms and open science. Finally, we highlight a set of challenges that remain to be investigated and offer potential guidelines for future research. Overall, our paper gives the APR community a systematic overview of the research landscape, helping researchers gain a comprehensive understanding of existing achievements and promoting future research.