Test fairness is a major concern in psychometric and educational research. A typical approach to ensuring test fairness is differential item functioning (DIF) analysis. DIF arises when a test item functions differently across subgroups, which are typically defined by respondents' demographic characteristics. Most existing research has focused on the statistical detection of DIF, while less attention has been paid to reducing or eliminating DIF and to understanding why it occurs. Meanwhile, computer-based assessments have become increasingly popular. The data generated as respondents interact with an item are recorded in computer log files and are referred to as process data. Process data provide valuable insights into respondents' problem-solving strategies and progress, offering new opportunities for DIF analysis. In this paper, we propose a novel method within the framework of generalized linear models (GLMs) that leverages process data to reduce and understand DIF. Specifically, we construct a nuisance trait surrogate from features extracted from process data. With the constructed nuisance trait, we introduce a new scoring rule that incorporates respondents' behaviors captured through process data on top of the target latent trait. We demonstrate the efficiency of our approach through extensive simulation experiments and an application to thirteen Problem Solving in Technology-Rich Environments (PSTRE) items from the 2012 Programme for the International Assessment of Adult Competencies (PIAAC).
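The idea of absorbing DIF with a nuisance trait surrogate can be illustrated with a small simulation. The sketch below is not the estimation procedure proposed in the paper. It is a minimal illustration that assumes a single binary item, a logistic GLM, a target trait `theta` treated as observed, and one hypothetical process-data feature `feature` that is a noisy proxy for a nuisance trait `eta`. All variable names and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit a logistic GLM by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)                      # GLM working weights
        grad = X.T @ (y - p)                   # score vector
        hess = X.T @ (X * w[:, None])          # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


rng = np.random.default_rng(0)
n = 20000

# Hypothetical simulated population (all values are illustrative assumptions).
group = rng.integers(0, 2, size=n)             # 0 = reference group, 1 = focal group
theta = rng.normal(size=n)                     # target latent trait, treated as known here
eta = rng.normal(loc=-1.0 * group, size=n)     # nuisance trait, lower on average in the focal group
feature = eta + rng.normal(scale=0.3, size=n)  # process-data feature: noisy proxy for eta

# The response depends on theta and eta, but not directly on group membership.
logit = -0.2 + 1.0 * theta + 0.8 * eta
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

ones = np.ones(n)
X_base = np.column_stack([ones, theta, group])           # standard DIF model
X_adj = np.column_stack([ones, theta, group, feature])   # adds the nuisance surrogate

gamma_base = fit_logistic(X_base, y)[2]  # group effect without the surrogate
gamma_adj = fit_logistic(X_adj, y)[2]    # group effect after conditioning on the surrogate

print(f"group effect without surrogate: {gamma_base:.3f}")
print(f"group effect with surrogate:    {gamma_adj:.3f}")
```

In this setup the group coefficient in the baseline model picks up the group difference in the nuisance trait and so appears as DIF. Once the process-data feature enters the model, the coefficient should shrink toward zero, although the measurement noise in the feature keeps it from vanishing entirely.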