Current methods for analyzing student engagement on e-learning platforms, including automated systems, often struggle with challenges such as handling fuzzy sentiment in text comments and relying on limited metadata. Traditional approaches, such as surveys and questionnaires, also suffer from small sample sizes and poor scalability. In this paper, we introduce LLM-SEM (Language Model-Based Student Engagement Metric), a novel approach that leverages video metadata and sentiment analysis of student comments to measure engagement. By utilizing recent Large Language Models (LLMs), we generate high-quality sentiment predictions to mitigate text fuzziness, and we normalize key features such as views and likes. Our holistic method combines comprehensive metadata with sentiment polarity scores to gauge engagement at both the course and lesson levels. We fine-tuned several LLMs, including AraBERT, TXLM-RoBERTa, Llama 3B, and Gemma 9B from Ollama, on human-annotated sentiment datasets to enhance prediction accuracy. Extensive experiments comparing these models demonstrate the effectiveness of LLM-SEM in providing a scalable and accurate measure of student engagement.
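The combination of normalized metadata and sentiment polarity described above can be sketched as follows. This is an illustrative sketch only, not the authors' exact formula: the min-max normalization, the equal weighting of views and likes, and the weight parameters `w_meta` and `w_sent` are all assumptions for the example.

```python
# Illustrative sketch (hypothetical, not the paper's exact metric):
# blend min-max-normalized video metadata with a sentiment polarity score.

def normalize(values):
    """Min-max normalize a list of raw counts (e.g. views, likes) to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def engagement_score(norm_views, norm_likes, polarity, w_meta=0.5, w_sent=0.5):
    """Weighted blend of normalized metadata and sentiment polarity.

    norm_views, norm_likes: already normalized to [0, 1].
    polarity: sentiment score in [-1, 1], rescaled to [0, 1] before blending.
    Returns an engagement score in [0, 1].
    """
    meta = (norm_views + norm_likes) / 2
    sent = (polarity + 1) / 2
    return w_meta * meta + w_sent * sent
```

In a setup like this, the per-comment polarity would come from the fine-tuned LLM's sentiment predictions, averaged per lesson or per course before being blended with the metadata features.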