Bug fixing is a central task in software development and maintenance. Recent research has made notable progress in applying large language models (LLMs) to automatic bug fixing. However, existing studies often overlook the collaborative nature of bug resolution and treat it as a single-stage process. To overcome this limitation, we introduce STEAM, a novel stage-wise framework that simulates the interactive behavior of multiple programmers working at different stages of a bug's life cycle. Taking inspiration from bug management practices, we decompose the bug-fixing task into four distinct stages: bug reporting, bug diagnosis, patch generation, and patch verification. LLMs perform these stages interactively, imitating the way programmers collaborate to resolve software bugs. By combining the contributions of all stages, STEAM enhances the bug-fixing capabilities of LLMs. We implement STEAM with ChatGPT, a dialogue-based LLM. Our evaluation on a widely adopted bug-fixing benchmark shows that STEAM achieves new state-of-the-art bug-fixing performance.
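To make the four-stage decomposition concrete, the following is a minimal sketch of how such a pipeline could be wired together. It is not the STEAM implementation: the prompts, the `ask` callable, the `fix_bug` function, the PASS/FAIL verdict convention, and the retry loop that feeds a rejected verdict back into patch generation are all assumptions made for illustration. A stub stands in for the model so the example runs without network access.

```python
from typing import Callable, Optional

# Hypothetical interface: any function that maps a prompt to a model reply.
LLM = Callable[[str], str]


def fix_bug(buggy_code: str, ask: LLM, max_rounds: int = 3) -> Optional[str]:
    """Run the four stages in order; return an accepted patch, or None."""
    # Stage 1: bug reporting (prompt wording is an assumption).
    report = ask(f"REPORT\nDescribe the faulty behaviour of this code:\n{buggy_code}")
    # Stage 2: bug diagnosis, based on the report.
    diagnosis = ask(
        f"DIAGNOSE\nReport:\n{report}\nCode:\n{buggy_code}\nExplain the root cause."
    )
    feedback = ""
    for _ in range(max_rounds):
        # Stage 3: patch generation, guided by the diagnosis and any feedback.
        patch = ask(
            f"PATCH\nDiagnosis:\n{diagnosis}\n{feedback}"
            f"Code:\n{buggy_code}\nReturn the fixed code only."
        )
        # Stage 4: patch verification; the verdict format is an assumption.
        verdict = ask(
            f"VERIFY\nOriginal:\n{buggy_code}\nPatch:\n{patch}\n"
            "Answer PASS or FAIL with a reason."
        )
        if verdict.strip().upper().startswith("PASS"):
            return patch
        # Assumed interaction: a rejected patch informs the next attempt.
        feedback = f"Previous attempt was rejected: {verdict}\n"
    return None


def stub_llm(prompt: str) -> str:
    """Canned replies keyed on the stage tag in the prompt's first line."""
    replies = {
        "REPORT": "add(2, 3) returns -1 instead of 5.",
        "DIAGNOSE": "The function subtracts instead of adding.",
        "PATCH": "def add(a, b):\n    return a + b",
        "VERIFY": "PASS: the patch adds the operands.",
    }
    return replies[prompt.split("\n", 1)[0]]


if __name__ == "__main__":
    print(fix_bug("def add(a, b):\n    return a - b", stub_llm))
```

In practice `ask` would wrap a call to a dialogue model such as ChatGPT. Keeping it as a plain callable means each stage could also be served by a separate conversation or a separate model.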