Learning gain differences between ChatGPT and human tutor generated algebra hints

Large Language Models (LLMs), such as ChatGPT, are quickly advancing AI to the frontiers of practical consumer use and leading industries to re-evaluate how they allocate resources for content production. Authoring of open educational resources and hint content within adaptive tutoring systems is labor intensive. Should LLMs like ChatGPT produce educational content on par with human-authored content, the implications would be significant for further scaling of computer tutoring system approaches. In this paper, we conduct the first learning gain evaluation of ChatGPT, comparing the efficacy of its hints with that of hints authored by human tutors in a study of 77 participants across two algebra topic areas, Elementary Algebra and Intermediate Algebra. We find that 70% of hints produced by ChatGPT passed our manual quality checks and that both the human and ChatGPT conditions produced positive learning gains. However, gains were statistically significant only for hints created by human tutors. Learning gains from human-created hints were substantially and statistically significantly higher than those from ChatGPT hints in both topic areas, though ChatGPT participants in the Intermediate Algebra experiment were near ceiling and were not even with the control group at pre-test. We discuss the limitations of our study and suggest several future directions for the field. Problem and hint content used in the experiment is provided for replicability.


