Improving Low-Resource Knowledge Tracing Tasks by Supervised Pre-training and Importance Mechanism Fine-tuning

Knowledge tracing (KT) aims to estimate students' knowledge mastery based on their historical interactions. Recently, deep learning based KT (DLKT) approaches have achieved impressive performance on the KT task. These DLKT models rely heavily on a large number of available student interactions. However, due to reasons such as budget constraints and privacy concerns, observed interactions are very limited in many real-world scenarios, resulting in so-called low-resource KT datasets. Directly training a DLKT model on a low-resource KT dataset may lead to overfitting, and it is difficult to choose an appropriate deep neural architecture. Therefore, in this paper, we propose a low-resource KT framework called LoReKT to address the above challenges. Inspired by the prevalent "pre-training and fine-tuning" paradigm, we aim to learn transferable parameters and representations from rich-resource KT datasets during the pre-training stage and subsequently facilitate effective adaptation to low-resource KT datasets. Specifically, we simplify existing sophisticated DLKT model architectures to a plain stack of transformer decoders. We design an encoding mechanism to incorporate student interactions from multiple KT data sources, and we develop an importance mechanism that prioritizes updating parameters with high importance while constraining less important ones during the fine-tuning stage. We evaluate LoReKT on six public KT datasets, and experimental results demonstrate the superiority of our approach in terms of AUC and accuracy. To encourage reproducible research, we make our data and code publicly available at https://anonymous.4open.science/r/LoReKT-C619.
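
The abstract does not say how parameter importance is computed or applied, so the following is only a minimal sketch of one plausible reading of "prioritize updating parameters with high importance while constraining less important ones". It assumes importance is estimated from average gradient magnitude on the low-resource data and used to scale each parameter's update. The function names `importance_scores` and `importance_weighted_step` and the toy scalar parameters are hypothetical and are not taken from the LoReKT code.

```python
from typing import Dict, List


def importance_scores(grads_per_batch: List[Dict[str, float]]) -> Dict[str, float]:
    """Assumed importance proxy: mean absolute gradient, normalized to [0, 1]."""
    names = grads_per_batch[0].keys()
    raw = {
        name: sum(abs(g[name]) for g in grads_per_batch) / len(grads_per_batch)
        for name in names
    }
    top = max(raw.values()) or 1.0  # avoid division by zero if all gradients are 0
    return {name: value / top for name, value in raw.items()}


def importance_weighted_step(
    params: Dict[str, float],
    grads: Dict[str, float],
    importance: Dict[str, float],
    lr: float = 0.1,
) -> Dict[str, float]:
    """One gradient step in which each update is scaled by its importance score."""
    return {
        name: params[name] - lr * importance[name] * grads[name]
        for name in params
    }


# Toy example: parameter "a" sees larger gradients than "b" on the target data.
observed = [{"a": 2.0, "b": 0.5}, {"a": -2.0, "b": 0.5}]
scores = importance_scores(observed)

pretrained = {"a": 1.0, "b": 1.0}
updated = importance_weighted_step(pretrained, {"a": 1.0, "b": 1.0}, scores)
```

In this toy setting the parameter with the higher score moves further from its pre-trained value, while the lower-scored one stays closer to it. A real implementation would operate on tensors and might use a different importance estimate or a regularization term instead of update scaling.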
