Bug reports are vital for software maintenance, allowing users to inform developers of the problems they encounter while using the software. As such, researchers have committed considerable resources toward automating bug replay to expedite software maintenance. Nonetheless, the success of current automated approaches is largely dictated by the characteristics and quality of bug reports, as these approaches are constrained by manually crafted patterns and pre-defined vocabulary lists. Inspired by the success of Large Language Models (LLMs) in natural language understanding, we propose AdbGPT, a new lightweight approach that automatically reproduces bugs from bug reports through prompt engineering, without any training or hard-coding effort. AdbGPT leverages few-shot learning and chain-of-thought reasoning to elicit human knowledge and logical reasoning from LLMs, accomplishing bug replay in a manner similar to a developer. Our evaluations demonstrate the effectiveness and efficiency of AdbGPT, which reproduces 81.3% of bug reports in 253.6 seconds on average, outperforming the state-of-the-art baselines and ablation variants. We also conduct a small-scale user study to confirm the usefulness of AdbGPT in enhancing developers' bug replay capabilities.
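To make the prompt-engineering idea concrete, the following is a minimal sketch (our illustration, not AdbGPT's actual prompt) of how a few-shot example with chain-of-thought reasoning can be assembled into a single prompt that asks an LLM to extract reproduction steps from a bug report; all example texts, function names, and step formats here are hypothetical.

```python
# Hypothetical sketch: combining few-shot learning with chain-of-thought
# reasoning for step-to-reproduce extraction. The example report, the
# reasoning text, and build_prompt are illustrative assumptions, not
# AdbGPT's real prompt template.

FEW_SHOT_EXAMPLES = [
    {
        "report": "App crashes when I tap Settings and then rotate the screen.",
        # Chain-of-thought: intermediate reasoning shown to the model
        "reasoning": "The user first opens Settings (a tap action), "
                     "then rotates the device (a rotate action).",
        "steps": ["1. Tap 'Settings'", "2. Rotate the screen"],
    },
]

def build_prompt(examples, new_report):
    """Assemble a few-shot chain-of-thought prompt for extracting
    steps-to-reproduce from a natural-language bug report."""
    parts = [
        "Extract the steps to reproduce from the bug report. "
        "Think step by step before answering.\n"
    ]
    for ex in examples:
        parts.append(f"Bug report: {ex['report']}")
        parts.append(f"Reasoning: {ex['reasoning']}")
        parts.append("Steps:\n" + "\n".join(ex["steps"]) + "\n")
    parts.append(f"Bug report: {new_report}")
    parts.append("Reasoning:")  # the model continues from this cue
    return "\n".join(parts)

prompt = build_prompt(
    FEW_SHOT_EXAMPLES,
    "Search result list is empty after typing a query and pressing Enter.",
)
```

Because the prompt ends with an open "Reasoning:" cue, the model is nudged to produce its intermediate reasoning before the final step list, mirroring the worked examples it was shown.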