Across domains such as medicine, employment, and criminal justice, predictive models often target labels that imperfectly reflect the outcomes of interest to experts and policymakers. For example, clinical risk assessments deployed to inform physician decision-making often predict measures of healthcare utilization (e.g., costs, hospitalization) as a proxy for patient medical need. These proxies are subject to outcome measurement error when they systematically differ from the target outcome they are intended to measure. However, prior efforts to characterize and mitigate outcome measurement error overlook the fact that the decision informed by a model often serves as a risk-mitigating intervention that affects both the target outcome of interest and its recorded proxy. In these settings, addressing measurement error therefore requires counterfactual modeling of treatment effects on outcomes. In this work, we study intersectional threats to model reliability introduced by outcome measurement error, treatment effects, and selection bias from historical decision-making policies. We develop an unbiased risk minimization method that, given knowledge of proxy measurement error properties, corrects for the combined effects of these challenges. We also develop a method for estimating treatment-dependent measurement error parameters when these are unknown in advance. We demonstrate the utility of our approach theoretically and via experiments on real-world data from randomized controlled trials conducted in healthcare and employment domains. Equally important, we demonstrate that models correcting for outcome measurement error or treatment effects alone suffer from considerable reliability limitations. Our work underscores the importance of considering intersectional threats to model validity during the design and evaluation of predictive models for decision support.