This paper introduces INTERVENOR (INTERactiVE chaiN Of Repair), a system designed to emulate the interactive code repair process observed in humans, encompassing both code diagnosis and code repair. INTERVENOR prompts Large Language Models (LLMs) to play two distinct roles during code repair: a Code Learner and a Code Teacher. Specifically, the Code Learner follows instructions to generate or repair code, while the Code Teacher crafts a Chain-of-Repair (CoR) that serves as guidance for the Code Learner. While generating the CoR, the Code Teacher checks the code generated by the Code Learner and reassesses how to address its bugs based on the error feedback received from compilers. Experimental results demonstrate that INTERVENOR surpasses baseline models, with improvements of approximately 18% and 4.3% over GPT-3.5 in code generation and code translation tasks, respectively. Further analyses show that CoR is effective at illuminating the reasons behind bugs and outlining solution plans in natural language. With feedback from code compilers, INTERVENOR can accurately identify syntax errors and assertion errors and provide precise instructions for repairing code. All data and code are available at https://github.com/NEUIR/INTERVENOR
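The Learner/Teacher interaction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`run_tests`, `intervenor_loop`, `stub_learner`, `stub_teacher`), the `max_turns` parameter, and the use of Python's built-in `exec` as a stand-in for compiler feedback are all assumptions made for illustration. In a real system the two stubs would presumably be LLM calls with role-specific prompts.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Feedback:
    passed: bool
    error: Optional[str]  # e.g. "SyntaxError: ..." or "AssertionError: ..."


def run_tests(code: str, tests: str) -> Feedback:
    """Run candidate code plus assert-based tests and capture the first error.

    Assumption: Python's exec stands in for the compiler feedback
    mentioned in the abstract.
    """
    namespace: dict = {}
    try:
        exec(compile(code + "\n" + tests, "<candidate>", "exec"), namespace)
    except Exception as exc:  # SyntaxError, AssertionError, NameError, ...
        return Feedback(False, f"{type(exc).__name__}: {exc}")
    return Feedback(True, None)


# Hypothetical role signatures:
#   learner(task, previous_code, chain_of_repair) -> new code
#   teacher(task, buggy_code, error_message)      -> chain of repair (text)
Learner = Callable[[str, Optional[str], Optional[str]], str]
Teacher = Callable[[str, str, str], str]


def intervenor_loop(task: str, tests: str, learner: Learner,
                    teacher: Teacher, max_turns: int = 3) -> str:
    """Alternate Learner and Teacher turns until the tests pass.

    Returns the last code produced; it may still fail if max_turns
    repairs were not enough.
    """
    code = learner(task, None, None)  # initial generation
    for _ in range(max_turns):
        feedback = run_tests(code, tests)
        if feedback.passed:
            return code
        # Teacher diagnoses the bug and writes a natural-language repair plan.
        chain_of_repair = teacher(task, code, feedback.error)
        # Learner repairs the code by following that plan.
        code = learner(task, code, chain_of_repair)
    return code


# Stub roles, used only to make the sketch runnable without an LLM.
def stub_learner(task: str, previous_code: Optional[str],
                 chain_of_repair: Optional[str]) -> str:
    if chain_of_repair is None:
        return "def add(a, b):\n    return a - b\n"  # deliberately buggy
    return "def add(a, b):\n    return a + b\n"


def stub_teacher(task: str, code: str, error: str) -> str:
    return (f"The tests failed with {error!r}. The function subtracts "
            "instead of adding; replace '-' with '+'.")


if __name__ == "__main__":
    repaired = intervenor_loop("Write add(a, b).", "assert add(1, 2) == 3",
                               stub_learner, stub_teacher)
    print(repaired)
```

Passing the error message to the teacher, rather than only the failing code, mirrors the abstract's point that the CoR is built from compiler feedback. Keeping the roles as plain callables is a design choice of this sketch so that the control flow can be exercised with stubs.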