This paper introduces INTERVENOR (INTERactiVE chaiN Of Repair), a system that emulates the interactive code repair process observed in humans, covering both code diagnosis and code repair. INTERVENOR prompts Large Language Models (LLMs) to play two distinct roles during code repair: a Code Learner and a Code Teacher. The Code Learner follows instructions to generate or repair code, while the Code Teacher crafts a Chain-of-Repair (CoR) that serves as guidance for the Code Learner. While generating the CoR, the Code Teacher checks the code produced by the Code Learner and reassesses how to address bugs based on the error feedback received from compilers. Experimental results show that INTERVENOR surpasses baseline models, with improvements of approximately 18% and 4.3% over GPT-3.5 on code generation and code translation tasks, respectively. Further analyses show that CoR is effective at illuminating the reasons behind bugs and outlining solution plans in natural language. With feedback from code compilers, INTERVENOR can accurately identify syntax errors and assertion errors and provide precise instructions for repairing code. All data and code are available at https://github.com/NEUIR/INTERVENOR
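The Learner/Teacher interaction described above can be sketched as a small loop. This is only an illustrative sketch under stated assumptions, not the released INTERVENOR implementation: the names `run_tests`, `repair_loop`, `toy_learner`, and `toy_teacher` are hypothetical, the two roles are stubbed with plain functions instead of LLM calls, and Python interpreter errors stand in for compiler feedback.

```python
from typing import Callable, Optional, Tuple

# Hypothetical role signatures (assumptions, not the paper's API):
#   learner(task, previous_code, chain_of_repair) -> new code
#   teacher(task, code, error_feedback) -> chain of repair (natural language)
Learner = Callable[[str, Optional[str], Optional[str]], str]
Teacher = Callable[[str, str, str], str]


def run_tests(code: str, test: str) -> Optional[str]:
    """Run the code and its tests; return None on success, else an error message."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # syntax errors surface here
        exec(test, namespace)  # assertion errors surface here
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def repair_loop(task: str, test: str, learner: Learner, teacher: Teacher,
                max_turns: int = 3) -> Tuple[str, bool]:
    """Alternate Learner and Teacher turns until the tests pass or turns run out."""
    code = learner(task, None, None)  # initial generation by the Code Learner
    for _ in range(max_turns):
        error = run_tests(code, test)
        if error is None:
            return code, True
        cor = teacher(task, code, error)  # Code Teacher writes the CoR
        code = learner(task, code, cor)   # Code Learner repairs following the CoR
    return code, run_tests(code, test) is None


# Toy stand-ins for the two LLM roles, for demonstration only.
def toy_learner(task: str, prev_code: Optional[str], cor: Optional[str]) -> str:
    if cor is None:
        return "def add(a, b):\n    return a - b\n"  # deliberately buggy first draft
    return "def add(a, b):\n    return a + b\n"      # repaired version


def toy_teacher(task: str, code: str, error: str) -> str:
    return f"The tests failed with {error!r}; the function should use + instead of -."


final_code, solved = repair_loop(
    "Write add(a, b) that returns the sum.",
    "assert add(2, 3) == 5",
    toy_learner,
    toy_teacher,
)
print(solved)
```

In a real setting the two stub functions would be replaced by prompts to an LLM, and `run_tests` by whatever compiler or test harness the target language uses; the turn limit of 3 is an arbitrary choice for this sketch.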