When programming students encounter errors in their code, compiler messages and static analysis output often provide limited guidance, particularly for novices. Personalized feedback from instructors can be effective but does not scale well. Recent advances in large language models (LLMs) enable automated feedback generation at scale. This study examines whether LLM-generated feedback with different levels of guidance is associated with differences in students' problem-solving behavior. We design three feedback types and compare them to a baseline in which students receive only compiler error messages. We analyze time to solution and number of attempts, and examine whether the observed differences vary with programming experience. Results from an online programming course show that LLM-generated feedback is associated with shorter time to solution than the compiler-message-only baseline, with less-guided feedback showing slightly stronger effects. Overall, the findings suggest that feedback structure plays an important role in how students progress toward correct solutions, and they motivate further work on adaptive feedback designs and longer-term learning outcomes.