DIF Analysis with Unknown Groups and Anchor Items

Ensuring fairness in instruments such as survey questionnaires and educational tests is crucial. One way to assess it is differential item functioning (DIF) analysis, which examines whether different subgroups respond differently to a particular item after controlling for their level on the overall latent construct. DIF analysis is typically conducted to assess measurement invariance at the item level. Traditional DIF analysis methods require prior knowledge of both the comparison groups (reference and focal groups) and the anchor items (a subset of DIF-free items). Such prior knowledge may not always be available, and psychometric methods have been proposed for DIF analysis when one of the two pieces of information is unknown. More specifically, when the comparison groups are unknown but the anchor items are known, latent DIF analysis methods have been proposed that estimate the unknown groups by latent classes. When the anchor items are unknown but the comparison groups are known, methods have also been proposed, typically under a sparsity assumption, namely that the number of DIF items is not too large. However, DIF analysis when both pieces of information are unknown has received little attention. This paper proposes a general statistical framework for this setting. In the proposed framework, we model the unknown groups by latent classes and introduce item-specific DIF parameters to capture the DIF effects. Assuming the number of DIF items is relatively small, we propose an $L_1$-regularised estimator that simultaneously identifies the latent classes and the DIF items. A computationally efficient Expectation-Maximisation (EM) algorithm is developed to solve the non-smooth optimisation problem for the regularised estimator. The performance of the proposed method is evaluated through simulation studies and an application to item response data from a real-world educational test.
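
To illustrate why an $L_1$ penalty can single out a small set of DIF items, here is a minimal sketch of soft-thresholding, the proximal operator of the $L_1$ penalty. This is not the paper's estimator or EM algorithm, only a generic illustration of the sparsity mechanism. The function name `soft_threshold`, the vector `delta` of unpenalised DIF estimates, and the tuning parameter `lam` are all hypothetical, and the numbers are made up for the example.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrink each entry towards zero by lam,
    setting entries with magnitude at most lam to exactly zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


# Hypothetical unpenalised DIF estimates for six items (illustrative values only).
delta = np.array([0.05, -0.10, 0.90, 0.02, -0.75, 0.08])
lam = 0.2  # hypothetical tuning parameter controlling the amount of shrinkage

delta_shrunk = soft_threshold(delta, lam)

# Items whose shrunken DIF parameter is non-zero would be flagged as DIF items;
# the rest would be treated as DIF-free in this toy illustration.
dif_items = np.flatnonzero(delta_shrunk)
print(delta_shrunk)
print(dif_items)
```

In this toy example, small effects are set exactly to zero while large ones survive with reduced magnitude. That behaviour is what allows a regularised estimator to select DIF items without a pre-specified anchor set. In practice the tuning parameter would have to be chosen by some data-driven criterion, which this sketch does not cover.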
