DIF Analysis with Unknown Groups and Anchor Items

Ensuring fairness in instruments such as survey questionnaires and educational tests is crucial. One way to assess it is Differential Item Functioning (DIF) analysis, which examines whether different subgroups respond differently to a particular item after controlling for their overall level on the latent construct. DIF analysis is typically conducted to assess measurement invariance at the item level. Traditional DIF analysis methods require prior knowledge of both the comparison groups (reference and focal groups) and the anchor items (a subset of DIF-free items). Such prior knowledge may not always be available, and psychometric methods have been proposed for DIF analysis when one of the two pieces of information is unknown. More specifically, when the comparison groups are unknown but the anchor items are known, latent DIF analysis methods have been proposed that estimate the unknown groups by latent classes. When the anchor items are unknown but the comparison groups are known, methods have also been proposed, typically under a sparsity assumption, namely that the number of DIF items is not too large. However, DIF analysis when both pieces of information are unknown has not received much attention. This paper proposes a general statistical framework for this setting. In the proposed framework, we model the unknown groups by latent classes and introduce item-specific DIF parameters to capture the DIF effects. Assuming that the number of DIF items is relatively small, we propose an $L_1$-regularised estimator that simultaneously identifies the latent classes and the DIF items. A computationally efficient Expectation-Maximisation (EM) algorithm is developed to solve the non-smooth optimisation problem for the regularised estimator. The performance of the proposed method is evaluated by simulation studies and an application to item response data from a real-world educational test.
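
To make the setting concrete, the following is a minimal sketch and not the estimator proposed in the paper. It assumes a hypothetical two-class two-parameter logistic model in which the latent class indicator xi_i shifts the intercept of item j by a DIF parameter delta_j, so that delta_j = 0 means item j is DIF-free. It also shows the soft-thresholding operator, the standard proximal map of an $L_1$ penalty, which is one common way such non-smooth penalties produce exact zeros. The function names, the model form, and all parameter values are illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrink x towards zero by lam,
    and set entries with |x| <= lam exactly to zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def simulate_responses(n_persons, a, d, delta, class_prob, rng):
    """Simulate binary responses from an assumed two-class 2PL model with DIF.

    Assumed model (illustrative, not taken from the paper):
        P(Y_ij = 1 | theta_i, xi_i) = sigmoid(a_j * theta_i + d_j + delta_j * xi_i)
    where xi_i in {0, 1} is the latent class of person i.
    """
    xi = rng.binomial(1, class_prob, size=n_persons)   # latent class (unobserved in practice)
    theta = rng.normal(0.0, 1.0, size=n_persons)       # latent construct level
    logits = np.outer(theta, a) + d + np.outer(xi, delta)
    prob = 1.0 / (1.0 + np.exp(-logits))
    y = rng.binomial(1, prob)                          # n_persons x n_items matrix of 0/1
    return y, xi


# Hypothetical set-up: 10 items, of which only the last 2 carry DIF (sparsity).
rng = np.random.default_rng(0)
n_items = 10
a = np.ones(n_items)                     # discrimination parameters
d = np.zeros(n_items)                    # intercepts
delta = np.zeros(n_items)
delta[-2:] = 1.0                         # sparse DIF effects
y, xi = simulate_responses(500, a, d, delta, class_prob=0.5, rng=rng)

# Soft-thresholding sets small candidate DIF effects exactly to zero.
shrunk = soft_threshold(np.array([-2.0, -0.3, 0.0, 0.5, 1.5]), 0.5)
```

In a regularised estimator of this kind, an update such as soft_threshold would typically be applied to the DIF parameters inside each maximisation step, so that items whose estimated effects fall below the penalty level are classified as DIF-free; whether the paper's EM algorithm takes exactly this form is not stated in the abstract.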
