Item Response Theory (IRT) is a statistical approach for evaluating test items and estimating test-taker abilities from observed responses. An IRT model that fits the data better yields more accurate latent trait estimates. In this study, we present a new model for multiple-choice data, the monotone multiple choice (MMC) model, which we fit using autoencoders. Using both simulated scenarios and real data from the Swedish Scholastic Aptitude Test, we demonstrate empirically that the MMC model fits better than the traditional nominal response IRT model. Furthermore, we show how the latent trait scale from any fitted IRT model can be transformed into a ratio scale, which aids score interpretation and makes it easier to compare different types of IRT models. We refer to these new scales as bit scales. Bit scales are especially useful for models that make few or no assumptions about the distribution of the latent trait, such as the autoencoder-fitted models in this study.