Computerized adaptive tests (CATs) play a crucial role in educational assessment and diagnostic screening in behavioral health. Unlike traditional linear tests that administer a fixed set of pre-assembled items, CATs tailor the test to an examinee's latent trait level by selecting a smaller subset of items based on the examinee's previous responses. Existing CAT frameworks predominantly rely on item response theory (IRT) models with a single latent variable, a choice driven by both conceptual simplicity and computational feasibility. However, many real-world item response datasets exhibit complex, multi-factor structures, which limits the applicability of CATs in broader settings. In this work, we develop a novel CAT system that incorporates multivariate latent traits, building on recent advances in Bayesian sparse multivariate IRT. Our approach leverages direct sampling from the latent factor posterior distributions, significantly accelerating existing information-theoretic item selection criteria by eliminating the need for computationally intensive Markov chain Monte Carlo (MCMC) simulations. Recognizing the potential sub-optimality of existing item selection rules, which are often based on myopic one-step lookahead optimization of some information-theoretic criterion, we propose a double deep Q-learning algorithm to learn an optimal item selection policy. Through simulation and real-data studies, we demonstrate that our approach not only accelerates existing item selection methods but also highlights the potential of reinforcement learning in CATs.
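To make the contrast with myopic one-step selection concrete, the following is a minimal sketch of how a double Q-learning target could be computed for a batch of item-selection transitions. It is an illustration under stated assumptions, not the implementation used in this work: the function name `double_q_targets`, the array layout, the availability mask for already-administered items, and the discount factor are all hypothetical choices made for the example. The only element taken from the double Q-learning idea itself is that the next item is chosen with the online network and its value is read from the target network.

```python
import numpy as np


def double_q_targets(rewards, dones, q_online_next, q_target_next,
                     available, gamma=0.99):
    """Double Q-learning targets for a batch of transitions (illustrative).

    rewards, dones:  arrays of shape (batch,); dones is 1.0 when the test ended.
    q_online_next:   (batch, n_items) Q-values of the online network at s'.
    q_target_next:   (batch, n_items) Q-values of the target network at s'.
    available:       boolean (batch, n_items); False marks items that were
                     already administered (an assumption of this sketch).
    """
    # Choose the next item with the online network, among unused items only.
    masked = np.where(available, q_online_next, -np.inf)
    best_next = masked.argmax(axis=1)
    # Evaluate that choice with the target network (the "double" step).
    next_value = q_target_next[np.arange(len(rewards)), best_next]
    # Do not bootstrap from the next state once the test has terminated.
    return rewards + gamma * (1.0 - dones) * next_value


# Toy batch: two examinees, a three-item bank.
rewards = np.array([1.0, 0.5])
dones = np.array([0.0, 1.0])
q_online_next = np.array([[0.2, 0.9, 0.1], [0.3, 0.1, 0.8]])
q_target_next = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
available = np.array([[True, False, True], [True, True, True]])

targets = double_q_targets(rewards, dones, q_online_next, q_target_next,
                           available, gamma=0.5)
print(targets)
```

In the first row the item with the highest online Q-value has already been administered, so the mask redirects the choice to the best remaining item. In the second row the test has ended, so the target reduces to the immediate reward. What the reward and state actually are in a CAT setting is left open here.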