Analyzing Process Data from Computer-Based Assessments: A Tutorial on Preprocessing, Feature Extraction, and Model-Based Inference

Computer-based assessments routinely generate detailed interaction logs, commonly referred to as process data, that record every action a respondent performs during task completion. Yet systematic preprocessing guidance, integrated analytical workflows, and cross-method consistency checks remain scarce in the literature. This paper provides a unified, end-to-end framework for analyzing process data from large-scale assessments, covering the full pipeline from raw log preprocessing to model-based inference, using the Problem Solving in Technology-Rich Environments (PS-TRE) domain of the Programme for the International Assessment of Adult Competencies (PIAAC) as an illustrative example. We first present a systematic preprocessing pipeline (timestamp correction, duplicate removal, action block consolidation, and LLM-assisted standardization) that transforms raw event-level logs into analysis-ready action sequences. We then review and demonstrate two complementary families of analytical methods. The first consists of feature-based methods and their downstream applications, including descriptive process indicators, n-gram analysis with TF-IDF weighting, multidimensional scaling, and process data-informed differential item functioning (DIF) analysis. The second consists of model-based approaches, namely hidden Markov models and the subtask identification procedure. Empirical illustrations using the United States sample show that n-gram-based behavioral clusters carry diagnostic information primarily among respondents who answered incorrectly, that features derived from multidimensional scaling comprehensively reconstruct observed behavioral variables, and that process-informed DIF analyses can identify and mitigate construct-irrelevant sources of group differences. Reproducible R code is provided for all major techniques.
