CodeTailor: LLM-Powered Personalized Parsons Puzzles for Engaging Support While Learning Programming

Learning to program can be challenging, and providing high-quality, timely support at scale is hard. Generative AI products such as ChatGPT can produce a solution for most introductory programming problems. However, students might use these tools simply to generate code, resulting in reduced engagement and limited learning. In this paper, we present CodeTailor, a system that leverages a large language model (LLM) to provide personalized help to students while still encouraging cognitive engagement. CodeTailor supports struggling students by providing a personalized Parsons puzzle. In a Parsons puzzle, students place mixed-up code blocks in the correct order to solve a problem. A technical evaluation using incorrect code snippets previously written by students demonstrated that CodeTailor could deliver high-quality (correct, personalized, and concise) Parsons puzzles based on each student's incorrect code. We conducted a within-subjects study with 18 novice programmers. Participants perceived CodeTailor as more engaging than simply receiving an LLM-generated solution (the baseline condition). In addition, participants applied more supported elements from the scaffolded practice to the posttest when using CodeTailor than in the baseline condition. Overall, most participants preferred CodeTailor over simply receiving LLM-generated code for learning. Qualitative observations and interviews also provided evidence for the benefits of CodeTailor, including thinking more about solution construction, fostering continuity in learning, promoting reflection, and boosting confidence. We suggest future design ideas to facilitate active learning opportunities with generative AI techniques.
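
To make the Parsons puzzle format concrete, the following is a minimal sketch of the general idea: a correct solution is split into line-level blocks, the blocks are shuffled, and a student's ordering is checked against the original. It is an illustration only and not CodeTailor's implementation; the function names `make_parsons_puzzle` and `is_solved`, the fixed shuffle seed, and the sample `total` problem are all assumptions introduced here.

```python
import random


def solution_blocks(solution: str) -> list[str]:
    """Split a correct solution into one block per non-empty line."""
    return [line for line in solution.splitlines() if line.strip()]


def make_parsons_puzzle(solution: str, seed: int = 0) -> list[str]:
    """Return the solution's blocks in a shuffled order (hypothetical helper)."""
    blocks = solution_blocks(solution)
    random.Random(seed).shuffle(blocks)  # seeded so the puzzle is reproducible
    return blocks


def is_solved(attempt: list[str], solution: str) -> bool:
    """Check whether the student's ordering matches the original solution."""
    return attempt == solution_blocks(solution)


# Hypothetical intro-level problem: sum a list of numbers.
SOLUTION = """def total(nums):
    result = 0
    for n in nums:
        result += n
    return result"""

puzzle = make_parsons_puzzle(SOLUTION)
for block in puzzle:
    print(block)
```

A student would rearrange the printed blocks and submit the new order to `is_solved`. This sketch compares orderings exactly, which is the simplest possible check; a real system would likely need a more forgiving one, since some problems admit more than one valid ordering.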
