Bug reports are vital to software maintenance because they allow users to inform developers of problems encountered while using the software. Researchers have therefore committed considerable resources to automating bug replay in order to expedite maintenance. Nonetheless, the success of current automated approaches depends largely on the characteristics and quality of the bug reports, because these approaches are constrained by manually crafted patterns and pre-defined vocabulary lists. Inspired by the success of Large Language Models (LLMs) in natural language understanding, we propose AdbGPT, a new lightweight approach that automatically reproduces bugs from bug reports through prompt engineering, without any training or hard-coding effort. AdbGPT leverages few-shot learning and chain-of-thought reasoning to elicit human knowledge and logical reasoning from LLMs, accomplishing bug replay in a manner similar to a developer. Our evaluations demonstrate the effectiveness and efficiency of AdbGPT, which reproduces 81.3% of bug reports in 253.6 seconds, outperforming both the state-of-the-art baselines and the variants in our ablation studies. We also conduct a small-scale user study that confirms the usefulness of AdbGPT in enhancing developers' bug replay capabilities.
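To make the combination of few-shot learning and chain-of-thought reasoning concrete, the following is a minimal Python sketch of how such a prompt might be assembled. It is not AdbGPT's actual prompt: the `Example` class, the `build_prompt` function, the instruction wording, and the sample bug reports are all hypothetical, and no LLM is called. The sketch only shows the general shape, in which each demonstration pairs a bug report with written-out reasoning and the resulting steps, and the new report ends at an open "Reasoning:" cue.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One hypothetical demonstration: a report, its reasoning, and the steps."""
    report: str
    reasoning: str
    steps: list[str]


def build_prompt(examples: list[Example], new_report: str) -> str:
    """Assemble a few-shot, chain-of-thought prompt as plain text."""
    # Illustrative instruction text, not taken from the paper.
    parts = ["Extract the steps to reproduce from each bug report."]
    for ex in examples:
        numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(ex.steps, start=1))
        parts.append(
            f"Bug report: {ex.report}\n"
            f"Reasoning: {ex.reasoning}\n"
            f"Steps:\n{numbered}"
        )
    # End on an open "Reasoning:" cue so the model writes its reasoning
    # before it lists the steps for the new report.
    parts.append(f"Bug report: {new_report}\nReasoning:")
    return "\n\n".join(parts)


# Made-up demonstration and query, for illustration only.
demo = Example(
    report="App crashes when I open settings and tap dark mode.",
    reasoning="The user first opens the settings screen, then taps the dark mode option.",
    steps=["Tap 'Settings'", "Tap 'Dark mode'"],
)
prompt = build_prompt([demo], "Rotating the screen on the login page clears the password.")
print(prompt)
```

Placing the reasoning before the steps in every demonstration is the design choice that matters here: the model is shown the intermediate logic, not just the answer, and is prompted to produce the same for the unseen report.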