Recent advances in testing for differential item functioning (DIF) have greatly relaxed two restrictions of the conventional multiple-group item response theory (IRT) model: the limit on the number of grouping variables and the assumption of predefined DIF-free anchor items. Applying the $L_1$ penalty to DIF detection has shown promising results in identifying DIF items without a priori knowledge of anchor items while allowing multiple grouping variables to be investigated simultaneously. The least absolute shrinkage and selection operator (LASSO) penalty is added directly to the loss function to encourage sparsity, so that the DIF parameters of anchor items are shrunk to zero. Therefore, no predefined anchor items are needed. However, DIF detection using LASSO requires a non-trivial model selection consistency assumption, and drawing statistical inference from it is difficult. Given the importance of identifying DIF items in test development, this study applies the decorrelated score test to test for DIF after the penalized method has been used. Unlike the existing regularized DIF method, which cannot test the statistical significance of a DIF item selected by LASSO, the decorrelated score test requires weaker assumptions and provides asymptotically valid inference for DIF. Additionally, the decorrelated score function can be used to construct asymptotically unbiased, normal, and efficient DIF parameter estimates via a one-step correction. The performance of the proposed decorrelated score test and the one-step estimator is evaluated via a Monte Carlo simulation study.
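
As a rough illustration of why an $L_1$ penalty can set the DIF parameters of anchor items exactly to zero, the sketch below applies the soft-thresholding operator, which is the proximal map of the $L_1$ penalty. It is not the estimation procedure of this study: the function name `soft_threshold`, the raw DIF values in `gamma_raw`, and the tuning value `lam=0.10` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of lam * |x|.

    Values with |x| <= lam become exactly zero; larger values are
    shrunk toward zero by lam.
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


# Hypothetical unpenalized DIF estimates for five items (illustrative only).
gamma_raw = np.array([0.05, -0.08, 0.90, 0.02, -0.60])

# Apply the L1 shrinkage with an assumed tuning parameter.
gamma_pen = soft_threshold(gamma_raw, lam=0.10)

# Items whose DIF parameter survives the penalty (zero-based indices).
flagged = np.flatnonzero(gamma_pen)
```

In this toy setting the three small values are set to zero, so those items act as anchors without being specified in advance, while the two large values survive but are shrunk toward zero by `lam`. That shrinkage bias is one reason a post-selection correction, such as the one-step estimator mentioned above, is of interest.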