Statistical risk assessments inform consequential decisions, such as pretrial release in criminal justice and loan approvals in consumer finance, by counterfactually predicting an outcome under a proposed decision (e.g., would the applicant default if we approved this loan?). There may, however, have been unmeasured confounders that jointly affected decisions and outcomes in the historical data. We propose a mean outcome sensitivity model that bounds the extent to which unmeasured confounders could affect outcomes on average. The mean outcome sensitivity model partially identifies the conditional likelihood of the outcome under the proposed decision, popular predictive performance metrics, and predictive disparities. We derive their identified sets and develop procedures for the confounding-robust learning and evaluation of statistical risk assessments. We propose a nonparametric regression procedure for the bounds on the conditional likelihood of the outcome under the proposed decision, and estimators for the bounds on predictive performance and disparities. Applying our methods to a real-world credit-scoring task from a large Australian financial institution, we show how varying assumptions on unmeasured confounding lead to substantive changes in the credit score's predictions and evaluations of its predictive disparities.
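As a rough illustration of how a bound on mean outcomes can translate into a bound on the conditional likelihood, the sketch below assumes a binary outcome Y, a binary decision D where the outcome is observed only when D = 1 (e.g., approved loans), and an analyst-chosen interval [delta_lo, delta_hi] for the gap E[Y(1) | D=0, X=x] - E[Y(1) | D=1, X=x]. By the law of total expectation, E[Y(1) | X=x] = mu1(x) + pi0(x) * delta(x), where mu1(x) = E[Y | D=1, X=x] and pi0(x) = P(D=0 | X=x). The function name, argument names, and this particular parameterization are illustrative assumptions, not details stated in the abstract.

```python
def mean_outcome_bounds(mu1, pi0, delta_lo, delta_hi):
    """Bounds on E[Y(1) | X=x] at a single covariate value x.

    mu1      -- observed outcome rate among decided-positive cases, E[Y | D=1, X=x]
    pi0      -- share of cases with no observed outcome, P(D=0 | X=x)
    delta_lo -- assumed lower bound on the unobserved-vs-observed mean gap
    delta_hi -- assumed upper bound on the same gap
    (All names are hypothetical; the inputs would be estimated in practice.)
    """
    if delta_lo > delta_hi:
        raise ValueError("delta_lo must not exceed delta_hi")

    def clip(v):
        # Y is binary, so the conditional mean must lie in [0, 1].
        return min(1.0, max(0.0, v))

    # Law of total expectation: E[Y(1)|x] = mu1 + pi0 * delta, with delta in [delta_lo, delta_hi].
    return clip(mu1 + pi0 * delta_lo), clip(mu1 + pi0 * delta_hi)


# Example: 30% observed default rate, half of applicants were not approved,
# and the unobserved group is assumed to default between 5 points less and
# 10 points more often than the observed group.
lower, upper = mean_outcome_bounds(mu1=0.30, pi0=0.50, delta_lo=-0.05, delta_hi=0.10)

# With no confounding allowed (delta fixed at 0), the interval collapses to mu1.
point_lo, point_hi = mean_outcome_bounds(mu1=0.30, pi0=0.50, delta_lo=0.0, delta_hi=0.0)
```

Widening [delta_lo, delta_hi] widens the interval in proportion to pi0, which is one way to see why predictions and disparity evaluations can shift substantially as the assumptions on unmeasured confounding vary.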