A Probabilistic Generative Model for Tracking Multi-Knowledge Concept Mastery Probability

Knowledge tracing aims to track students' knowledge status over time in order to predict their future performance accurately. Markov chain-based knowledge tracing (MCKT) models can track the mastery probability of knowledge concepts over time. However, as the number of tracked knowledge concepts increases, the time complexity of predicting student performance with MCKT grows exponentially (also called the explaining-away problem). In addition, when modeling students' responses, existing MCKT models consider only the relationship between students' knowledge status and problems, and ignore the relationship between the knowledge concepts in the same problem. To address these challenges, we propose an inTerpretable pRobAbilistiC gEnerative moDel (TRACED), which can track the mastery probabilities of a large number of knowledge concepts over time. To solve the explaining-away problem, we design Long Short-Term Memory (LSTM)-based networks to approximate the posterior distribution and to predict students' future performance, and we propose a heuristic algorithm to train the LSTMs and the probabilistic graphical model jointly. To better model students' exercise responses, we propose a log-linear model with three interactive strategies, which models the responses by considering the relationships among students' knowledge status, knowledge concepts, and problems. We conduct experiments on four real-world datasets in three knowledge-driven tasks. The experimental results show that TRACED outperforms existing knowledge tracing methods in predicting students' future performance and can learn the relationships among students, knowledge concepts, and problems from students' exercise sequences. We also conduct several case studies, which show that TRACED exhibits excellent interpretability and thus has potential for personalized automatic feedback in real-world educational environments.
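
To make the exponential cost mentioned above concrete, the following minimal sketch computes the probability of a correct response by exact marginalization over K binary mastery variables. It is an illustration only, not the TRACED model: the function name `predict_correct_exact`, the independent per-concept mastery probabilities, and the sigmoid response model with `weights` and `bias` are all assumptions introduced here. The point is that the loop visits 2^K joint mastery states, which is the cost an approximate posterior is meant to avoid.

```python
import itertools
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_correct_exact(mastery_probs, weights, bias):
    """Return (P(correct), number of joint states visited).

    Hypothetical setup: each concept k is mastered independently with
    probability mastery_probs[k], and the response follows an assumed
    logistic model over the binary mastery vector.
    """
    k = len(mastery_probs)
    total = 0.0
    n_states = 0
    # Enumerate every joint mastery state: 2**k combinations.
    for state in itertools.product((0, 1), repeat=k):
        # Probability of this joint state under the independence assumption.
        p_state = 1.0
        for mastered, p in zip(state, mastery_probs):
            p_state *= p if mastered == 1 else (1.0 - p)
        # Assumed response model: logistic in the mastered concepts.
        score = bias + sum(w * m for w, m in zip(weights, state))
        total += p_state * sigmoid(score)
        n_states += 1
    return total, n_states


if __name__ == "__main__":
    for k in (2, 4, 8, 12):
        prob, visited = predict_correct_exact([0.5] * k, [1.0] * k, -1.0)
        print(f"K={k:2d}  states visited={visited:5d}  P(correct)={prob:.4f}")
```

Each additional concept doubles the number of states visited, so exact prediction becomes impractical once a problem set covers more than a few dozen concepts.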
