In test equating, ensuring score comparability across different test forms is crucial but particularly challenging when test groups are non-equivalent and no anchor test is available. Local test equating aims to satisfy Lord's equity requirement by conditioning equating transformations on individual-level information, typically using anchor test scores as proxies for latent ability. However, anchor tests are not always available in practice. This paper introduces two novel propensity score-based methods for local equating: stratification and inverse probability weighting (IPW). These methods use covariates to account for group differences, with propensity scores serving as proxies for latent ability differences between test groups. The stratification method partitions examinees into comparable groups based on similar propensity scores, while IPW assigns weights inversely proportional to the probability of group membership. We evaluate these methods through empirical analysis and simulation studies. Results indicate both methods can effectively adjust for group differences, with their relative performance depending on the strength of covariate-ability correlations. The study extends local equating methodology to cases where only covariate information is available, providing testing programs with new tools for ensuring fair score comparability.
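The two adjustments described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`fit_propensity`, `stratify`, `ipw_weights`), the plain logistic propensity model, the choice of five quantile-based strata, and the simulated covariates are all assumptions made for illustration. The sketch estimates each examinee's probability of belonging to the group that took one form, then derives strata and inverse probability weights from it.

```python
import numpy as np


def fit_propensity(X, g, n_iter=25):
    """Estimate P(g = 1 | X) with logistic regression fitted by Newton-Raphson.

    X: (n, p) covariate matrix; g: (n,) 0/1 indicator of test group.
    The logistic model is an illustrative assumption, not the paper's choice.
    """
    Xd = np.column_stack([np.ones(len(X)), X])  # add intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        # Small ridge term keeps the Hessian invertible.
        hessian = Xd.T @ (Xd * w[:, None]) + 1e-8 * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hessian, Xd.T @ (g - p))
    return 1.0 / (1.0 + np.exp(-Xd @ beta))


def stratify(ps, n_strata=5):
    """Assign each examinee to a stratum (0 .. n_strata - 1) by propensity quantile."""
    inner_edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1)[1:-1])
    return np.searchsorted(inner_edges, ps, side="right")


def ipw_weights(ps, g):
    """Weight each examinee by the inverse probability of their observed group."""
    return np.where(g == 1, 1.0 / ps, 1.0 / (1.0 - ps))


# Simulated example (hypothetical data): two covariates drive group membership.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
true_p = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
g = rng.binomial(1, true_p)

ps = fit_propensity(X, g)
strata = stratify(ps)
weights = ipw_weights(ps, g)
```

In a stratified analysis, an equating function would then be estimated separately within each stratum; under IPW, the weights would enter the estimation of each group's score distribution before equating. Both steps are outside this sketch.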