Analyses of heterogeneous treatment effects (HTE) are common in applied causal inference research. However, when the outcome is a latent variable measured with a psychometric instrument such as an educational test, standard methods ignore the HTE that may exist across the individual items of the outcome measure. Failing to account for this "item-level" HTE (IL-HTE) can lead both to estimated standard errors that are too small and to identification problems when estimating treatment-by-covariate interaction effects. We demonstrate how Item Response Theory (IRT) models that estimate a treatment effect for each assessment item can address these challenges and provide new insights into HTE more generally. We articulate the theoretical rationale for the IL-HTE model and demonstrate its practical value using data from 20 randomized controlled trials in economics, education, and health research, comprising 2.3 million item responses. Our results show that the IL-HTE model reveals item-level variation masked by average treatment effects, provides more accurate statistical inference, allows the generalizability of causal effects to be estimated, resolves identification problems in the estimation of interaction effects, and provides standardized treatment effect sizes corrected for attenuation due to measurement error.
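As a reading aid, one way to write down the kind of model described above is sketched here. The notation is an assumption made for illustration and is not taken from the text: a Rasch-type (one-parameter logistic) IRT model for a binary response $y_{ij}$ of person $j$ to item $i$, in which the treatment effect on each item is an average effect plus an item-specific deviation.

```latex
% Illustrative sketch only; all symbols are assumed notation.
%   y_{ij}   : binary response of person j to item i
%   T_j      : treatment indicator for person j (1 = treated, 0 = control)
%   \theta_j : latent trait of person j
%   b_i      : easiness of item i
%   \beta    : average treatment effect on the logit scale
%   \zeta_i  : item-specific deviation from the average treatment effect
\begin{align}
  \operatorname{logit} \Pr(y_{ij} = 1)
    &= \theta_j + b_i + (\beta + \zeta_i)\, T_j, \\
  \theta_j &\sim N(0, \sigma_\theta^2), \qquad
  b_i \sim N(0, \sigma_b^2), \qquad
  \zeta_i \sim N(0, \sigma_\zeta^2).
\end{align}
```

In this sketch, $\sigma_\zeta$ summarizes how much the treatment effect varies across items. Setting $\sigma_\zeta = 0$ recovers a model with a single treatment effect common to all items, which is the restriction that ignores item-level heterogeneity.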