Beyond the Training Distribution: Evaluating Predictions Under Distribution Shift and Selection Bias

Understanding how a prediction model will perform in a new environment before deployment is essential to preventing harm when algorithms inform decision-making. Two common sources of model performance degradation are (i) covariate shift, where the target covariate distribution differs from the source, and (ii) selective labels, where the observability of outcomes depends on historical decisions. We study pre-deployment model evaluation under the joint presence of covariate shift and labeling of outcomes selectively based on observed features. In particular, we present a double machine learning procedure for estimating the target risk of an arbitrary black-box prediction model under a general loss function. We show identification of this estimand under standard assumptions and derive a bias-corrected estimator based on the influence function of the target risk. Finally, we evaluate our estimator through experiments using the eICU electronic health records database, showing that it tracks the true target risk more accurately than methods that address either selective labels or covariate shift alone, as well as baselines that combine standard plug-in approaches.
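To make the shape of such a bias-corrected estimator concrete, here is a minimal sketch of a generic doubly robust target-risk estimate. It is an illustration under stated assumptions and not necessarily the exact estimator derived in the paper. It assumes that labeling depends only on observed features, that the outcome distribution given features is shared by source and target, and that three nuisance functions are available: the conditional expected loss mu(x), the labeling propensity pi(x), and the density ratio w(x) between target and source covariates. The function name `dr_target_risk` and all variable names are hypothetical. In a double machine learning setting the nuisances would be fitted with cross-fitting; the toy check below plugs in their true values on simulated data with a single binary feature.

```python
import numpy as np


def dr_target_risk(loss_src, labeled_src, mu_src, pi_src, w_src, mu_tgt):
    """Doubly robust target-risk estimate (illustrative sketch, hypothetical API).

    loss_src    : loss of the model on source rows (NaN allowed where unlabeled)
    labeled_src : 1 if the outcome was observed on that source row, else 0
    mu_src      : estimated E[loss | X] on source rows
    pi_src      : estimated P(labeled = 1 | X) on source rows
    w_src       : estimated density ratio p_target(X) / p_source(X) on source rows
    mu_tgt      : estimated E[loss | X] on target rows
    """
    labeled_src = np.asarray(labeled_src, dtype=float)
    # Losses on unlabeled rows are unobserved; zero them so NaNs cannot leak in.
    loss_obs = np.where(labeled_src == 1, loss_src, 0.0)
    # Plug-in term: average predicted loss over the target covariates.
    plug_in = np.mean(mu_tgt)
    # Correction term: reweighted residuals on labeled source rows.
    correction = np.mean(w_src * labeled_src / pi_src * (loss_obs - mu_src))
    return plug_in + correction


# Toy check with one binary feature and known nuisances (all values are assumptions).
rng = np.random.default_rng(0)
n = 200_000
p_src, p_tgt = 0.3, 0.7                  # P(X = 1) in source and target
risk_by_x = np.array([0.2, 0.6])         # E[loss | X = 0], E[loss | X = 1]
label_prob_by_x = np.array([0.8, 0.4])   # P(labeled | X = 0), P(labeled | X = 1)

x_src = rng.binomial(1, p_src, n)
x_tgt = rng.binomial(1, p_tgt, n)
labeled = rng.binomial(1, label_prob_by_x[x_src])
loss = rng.binomial(1, risk_by_x[x_src]).astype(float)
loss[labeled == 0] = np.nan              # selective labels: loss unobserved

# True density ratio p_target(x) / p_source(x) for x = 0 and x = 1.
w_by_x = np.array([(1 - p_tgt) / (1 - p_src), p_tgt / p_src])

true_risk = (1 - p_tgt) * risk_by_x[0] + p_tgt * risk_by_x[1]
est_dr = dr_target_risk(
    loss, labeled,
    mu_src=risk_by_x[x_src],
    pi_src=label_prob_by_x[x_src],
    w_src=w_by_x[x_src],
    mu_tgt=risk_by_x[x_tgt],
)
# Naive baseline: average loss over labeled source rows only.
est_naive = np.nanmean(loss)
# Deliberately wrong loss model (all zeros) with correct weights.
est_wrong_mu = dr_target_risk(
    loss, labeled, np.zeros(n), label_prob_by_x[x_src], w_by_x[x_src], np.zeros(n)
)
```

In this simulation the naive labeled-source average ignores both the shift and the selection, while the corrected estimate should land close to `true_risk`. The last call illustrates the double robustness property of this form: with a misspecified loss model but correct propensity and density ratio, the correction term alone still recovers the target risk up to sampling noise.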

