LLMs trained to understand programming syntax now provide effective assistance to developers and are used in programming education, for example to generate coding problem examples or to explain code. A key aspect of programming education is understanding and dealing with error messages. However, 'logical errors', in which a program behaves contrary to the programmer's intentions, produce no error messages from the compiler. In this study, building on existing research on programming errors, we first define the types of logical errors that can occur in programming in general. Based on this definition, we propose an effective approach for detecting logical errors with LLMs that exploits the relations among error types in Chain-of-Thought and Tree-of-Thought prompts. The experimental results indicate that including such logical error descriptions in the prompt raises average classification performance by about 21% compared with prompts that omit them. We also conducted an experiment on exploiting the relations among errors to generate a new logical error dataset with LLMs. As datasets for logical errors are very scarce, such a benchmark dataset can be useful for various programming-related applications. We expect that our work can help novice programmers identify the causes of code errors and correct them more effectively.
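To illustrate the prompting idea, the following is a minimal sketch of how logical error type descriptions might be embedded in a Chain-of-Thought prompt. The error type names and descriptions here are placeholder assumptions for illustration, not the taxonomy defined in this work, and `build_cot_prompt` is a hypothetical helper.

```python
# Hypothetical sketch: a Chain-of-Thought prompt that embeds logical error
# type descriptions so the LLM reasons over each type before classifying.
# The taxonomy below is an illustrative assumption, not the paper's definition.

LOGICAL_ERROR_TYPES = {
    "off-by-one": "a loop bound or index deviates from the intended range by one",
    "wrong-operator": "an operator (e.g. < vs <=, + vs -) differs from the intent",
    "wrong-variable": "a similarly named variable is used instead of the intended one",
    "missing-condition": "a branch omits a case the specification requires",
}

def build_cot_prompt(code: str) -> str:
    """Compose a prompt asking the model to check each error type step by step."""
    descriptions = "\n".join(
        f"- {name}: {desc}" for name, desc in LOGICAL_ERROR_TYPES.items()
    )
    return (
        "The following logical error types can occur in programs:\n"
        f"{descriptions}\n\n"
        "Think step by step: for each type above, check whether the code "
        "below exhibits it, then name the most likely error type.\n\n"
        f"Code:\n{code}\n"
    )

prompt = build_cot_prompt("for i in range(1, len(xs)): total += xs[i]")
print(prompt)
```

The resulting prompt string would then be sent to the LLM; a Tree-of-Thought variant could branch the same per-type checks into separate reasoning paths and compare their conclusions.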