As large language models (LLMs) have advanced in the field of programming, intelligent programming coaching systems have gained widespread attention. However, most research focuses on repairing the buggy code of programming learners without explaining the underlying causes of the bugs. To address this gap, we introduce a novel task, LPR (Learner-Tailored Program Repair). We then propose an effective framework, LSGEN (Learner-Tailored Solution Generator), which enhances program repair while also generating descriptions of the bugs in the buggy code. In the first stage, we use a repair solution retrieval framework to construct a solution retrieval database and then employ an edit-driven code retrieval approach to retrieve valuable solutions, which guide LLMs in identifying and fixing the bugs. In the second stage, we propose a solution-guided program repair method, which fixes the code and provides explanations under the guidance of the retrieved solutions. Moreover, we propose an Iterative Retrieval Enhancement method that uses the evaluation results of the generated code to iteratively optimize the retrieval direction and explore more suitable repair strategies, improving performance in practical programming coaching scenarios. Experimental results show that our approach outperforms a set of baselines by a large margin, validating the effectiveness of our framework for the newly proposed LPR task.
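The retrieve, repair, evaluate, and re-retrieve loop described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LSGEN implementation: the names `Solution`, `retrieve`, and `iterative_repair` are hypothetical, plain string similarity from Python's `difflib` stands in for the edit-driven code retrieval, and the LLM repair step and the evaluator are passed in as ordinary callables.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Solution:
    """One hypothetical entry in the solution retrieval database."""
    buggy: str        # a past buggy submission
    fixed: str        # its corrected version
    description: str  # explanation of what was wrong


def retrieve(query: str, database: List[Solution], top_k: int,
             skip: Set[int]) -> List[int]:
    """Return indices of the top_k entries whose buggy code is most similar
    to `query`. Assumption: character-level similarity is only a stand-in
    for the edit-driven retrieval named in the abstract."""
    scored = [
        (difflib.SequenceMatcher(None, query, s.buggy).ratio(), i)
        for i, s in enumerate(database)
        if i not in skip
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]


def iterative_repair(
    buggy: str,
    database: List[Solution],
    repair: Callable[[str, List[Solution]], Tuple[str, str]],
    passes: Callable[[str], bool],
    max_rounds: int = 3,
    top_k: int = 1,
) -> Optional[Tuple[str, str]]:
    """Retrieve solutions, attempt a guided repair, evaluate the result, and
    use a failed attempt to steer the next retrieval round."""
    query = buggy
    tried: Set[int] = set()
    for _ in range(max_rounds):
        hits = retrieve(query, database, top_k, tried)
        if not hits:
            break  # nothing left to retrieve
        fixed, explanation = repair(buggy, [database[i] for i in hits])
        if passes(fixed):
            return fixed, explanation
        tried.update(hits)  # do not retrieve the same solutions again
        query = fixed       # assumption: the failed attempt redirects retrieval
    return None


# Toy usage with a stub in place of the LLM: it copies the first retrieved fix.
def stub_repair(code: str, guides: List[Solution]) -> Tuple[str, str]:
    return guides[0].fixed, guides[0].description


db = [
    Solution("def mul(a, b):\n    return a + b\n",
             "def mul(a, b):\n    return a * b\n",
             "used + instead of *"),
    Solution("def add(a, b):\n    return a - b\n",
             "def add(a, b):\n    return a + b\n",
             "used - instead of +"),
]
learner_code = "def add(a, b):\n    return a - b\n"
result = iterative_repair(learner_code, db, stub_repair,
                          passes=lambda code: "a + b" in code)
```

In a real setting, `repair` would prompt an LLM with the learner's code plus the retrieved solutions, and `passes` would run the task's test cases; both are left abstract here because the abstract does not specify them.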