Course evaluation is a critical component of higher education pedagogy. It serves not only to identify limitations in existing course designs and inform curricular innovation, but also to offer quantitative insights for university administrative decision-making. Traditional evaluation methods, primarily comprising student surveys, instructor self-assessments, and expert reviews, often face challenges including inherent subjectivity, delayed feedback, inefficiency, and limited capacity to assess innovative teaching approaches. Recent advancements in large language models (LLMs) within artificial intelligence (AI) present promising new avenues for enhancing course evaluation processes. This study explores the application of LLMs to automated course evaluation from multiple perspectives and conducts rigorous experiments across 100 courses at a major university in China. The findings indicate that: (1) LLMs can be an effective tool for course evaluation; (2) their effectiveness is contingent upon appropriate fine-tuning and prompt engineering; and (3) LLM-generated evaluation results demonstrate a notable level of rationality and interpretability.