Generative AI is changing the way that many disciplines are taught, including computer science. Researchers have shown that generative AI tools are capable of solving programming problems, writing extensive blocks of code, and explaining complex code in simple terms. Particular promise has been shown in using generative AI to enhance programming error messages. Both students and instructors have complained for decades that these messages are often cryptic and difficult to understand. Yet recent work has shown that students make fewer repeated errors when error messages are enhanced via GPT-4. We extend this work by implementing feedback from ChatGPT for all programs submitted to our automated assessment tool, Athene, providing help for compiler, run-time, and logic errors. Our results indicate that adding generative AI to an automated assessment tool does not necessarily improve it, and that the design of the interface matters greatly to the usability of the feedback that GPT-4 provided.