The expansion of large-scale online education platforms has made vast amounts of student interaction data available for knowledge tracing (KT). KT models estimate students' concept mastery from interaction data, but their performance is sensitive to input data quality. Gaming behaviors, such as excessive hint use, may misrepresent students' knowledge and undermine model reliability. However, systematic investigations of how different types of gaming behaviors affect KT remain scarce, and existing studies rely on costly manual analysis that does not capture behavioral diversity. In this study, we conceptualize gaming behaviors as a form of data poisoning, defined as the deliberate submission of incorrect or misleading interaction data to corrupt a model's learning process. We design Data Poisoning Attacks (DPAs) to simulate diverse gaming patterns and systematically evaluate their impact on KT model performance. Moreover, drawing on advances in DPA detection, we explore unsupervised approaches to enhance the generalizability of gaming behavior detection. We find that KT model performance degrades most notably in response to random-guess behaviors. Our findings provide insights into the vulnerabilities of KT models and highlight the potential of adversarial methods for improving the robustness of learning analytics systems.
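To make the poisoning setup concrete, the sketch below illustrates one way a random-guess gaming pattern could be injected into a KT interaction sequence. This is a minimal illustration under our own assumptions, not the paper's implementation: the function name `poison_random_guess` and the parameters `poison_rate` and `guess_prob` are hypothetical, and real DPAs may perturb hints, timing, or skill tags as well as correctness labels.

```python
import numpy as np

def poison_random_guess(responses, poison_rate=0.2, guess_prob=0.25, seed=0):
    """Overwrite a fraction of one student's correctness labels with random guesses.

    responses   : 1-D array of 0/1 correctness labels for an interaction sequence.
    poison_rate : fraction of interactions to corrupt (hypothetical parameter).
    guess_prob  : chance a random guess happens to be correct
                  (e.g. 0.25 for four-option multiple choice).
    """
    rng = np.random.default_rng(seed)
    poisoned = responses.copy()
    n_poison = int(poison_rate * len(responses))
    idx = rng.choice(len(responses), size=n_poison, replace=False)
    # Replace the selected interactions with Bernoulli(guess_prob) outcomes,
    # mimicking a student who answers those items by guessing.
    poisoned[idx] = (rng.random(n_poison) < guess_prob).astype(responses.dtype)
    return poisoned

# Example: a clean sequence of 10 responses, 20% overwritten with guesses.
clean = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
print(poison_random_guess(clean, poison_rate=0.2))
```

A KT model trained or evaluated on sequences perturbed this way would be compared against the clean baseline to quantify the performance drop attributed to random-guess behavior.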