The rapid rise of LLMs over the last few years has prompted growing experimentation with LLM-driven AI tutors. However, both the implementation details and the benefits in a teaching environment are still at an early stage of exploration. This article addresses these issues through the implementation of an AI Teaching Assistant (AI-TA) using Retrieval-Augmented Generation (RAG) for the Master's-level Motion Picture Engineering (MPE) course at Trinity College Dublin. We provide details of our implementation (including the LLM prompt and code) and highlight how we designed and tuned our RAG pipeline to meet course needs. We describe our survey instrument and report on the impact of the AI-TA through a number of quantitative metrics. The scale of our experiment (43 students, 296 sessions, 1,889 queries over 7 weeks) was sufficient to give us confidence in our findings. Unlike previous studies, we experimented with allowing the use of the AI-TA in open-book examinations. Statistical analysis across three exams showed no significant difference in performance with or without AI-TA access (p > 0.05), demonstrating that thoughtfully designed assessments can maintain academic validity. Student feedback indicated that the AI-TA was beneficial (mean = 4.22/5), while students had mixed feelings about preferring it over human tutoring (mean = 2.78/5).