Digital textbook (e-book) systems record student interactions with textbooks as a sequence of events called EventStream data. Researchers have extracted features from EventStream data and used them as inputs for downstream tasks such as grade prediction and student behavior modeling. Previous studies mainly evaluated models built on statistics-based features derived from EventStream logs, such as the number of operation types or access frequencies. While these features provide some insight, they lack the temporal information needed to capture fine-grained differences in learning behavior among students. This study proposes E2Vec, a novel feature representation method based on word embeddings. The proposed method treats the operation logs and their time intervals for each student as a string of characters and generates a student vector of learning activity features that incorporates time information. We applied fastText to generate an embedding vector for each of 305 students in a dataset from two years of computer science courses. We then investigated the effectiveness of E2Vec in an at-risk detection task, demonstrating its potential in terms of generalizability and performance.
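The idea of treating operations and their time intervals as a character string can be illustrated with a minimal sketch. The abstract does not give the actual encoding, so everything below is an assumption made for illustration: the operation names, the characters assigned to them, the two interval thresholds, and the choice to turn long gaps into word boundaries are all hypothetical.

```python
from datetime import datetime

# Hypothetical mapping from operation names to single characters.
OP_TO_CHAR = {
    "OPEN": "O",
    "NEXT": "N",
    "PREV": "P",
    "ADD_MARKER": "M",
    "CLOSE": "C",
}
UNKNOWN_CHAR = "?"  # used for operations missing from the mapping

# Hypothetical interval thresholds, in seconds.
SHORT_PAUSE = 10   # gaps of at least this long insert a "-" inside the word
LONG_PAUSE = 300   # gaps of at least this long start a new word (a space)


def interval_symbol(gap_seconds):
    """Return the symbol that encodes the gap between two consecutive events."""
    if gap_seconds >= LONG_PAUSE:
        return " "
    if gap_seconds >= SHORT_PAUSE:
        return "-"
    return ""


def encode_events(events):
    """Encode one student's events as a string.

    events: iterable of (ISO 8601 timestamp, operation name) pairs, in any order.
    """
    ordered = sorted((datetime.fromisoformat(ts), op) for ts, op in events)
    chars = []
    prev_time = None
    for ts, op in ordered:
        if prev_time is not None:
            gap = (ts - prev_time).total_seconds()
            chars.append(interval_symbol(gap))
        chars.append(OP_TO_CHAR.get(op, UNKNOWN_CHAR))
        prev_time = ts
    return "".join(chars)


example_events = [
    ("2024-04-01T09:00:00", "OPEN"),
    ("2024-04-01T09:00:03", "NEXT"),
    ("2024-04-01T09:00:45", "NEXT"),
    ("2024-04-01T09:01:00", "ADD_MARKER"),
    ("2024-04-01T09:20:00", "PREV"),
    ("2024-04-01T09:20:02", "CLOSE"),
]
print(encode_events(example_events))  # prints "ON-N-M PC"
```

Under these assumptions, each student's log becomes a short text of whitespace-separated "words", which is the kind of input a subword-aware embedding model such as fastText is designed to consume. How the resulting word vectors are aggregated into a single student vector is not specified in the abstract and is left out of the sketch.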