We consider the estimation of measures of model performance in a target population when covariate and outcome data are available on a sample from a source population and covariate data, but not outcome data, are available on a simple random sample from the target population. When outcome data are not available from the target population, measures of model performance can be identified under the untestable assumption that the outcome is independent of population membership (source or target) conditional on covariates. In practice, this assumption is uncertain and, in some cases, controversial. Sensitivity analysis is therefore useful for examining how violations of the assumption affect inferences about model performance. Here, we propose an exponential tilt sensitivity analysis model and develop statistical methods to determine how sensitive measures of model performance are to violations of the assumption of conditional independence between outcome and population. We provide identification results and estimators for the risk in the target population, examine the large-sample properties of the estimators, and apply the estimators to data on individuals with stable ischemic heart disease.
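As an illustrative sketch only, and not necessarily the exact parameterization used in this work, an exponential tilt model of this kind is commonly written as below. All notation is assumed for the illustration: $S=1$ denotes the source population and $S=0$ the target, $X$ the covariates, $Y$ the outcome, $g(X)$ the prediction model, $L$ a loss function, $q$ a user-specified function of the outcome, and $\eta$ the sensitivity parameter.

```latex
% Assumed notation (hypothetical, for illustration): S=1 source, S=0 target,
% eta = sensitivity parameter, q = user-chosen function of the outcome.
\begin{align}
f(y \mid X, S=0)
  &= \frac{\exp\{\eta\, q(y)\}\, f(y \mid X, S=1)}
          {\mathrm{E}\left[\exp\{\eta\, q(Y)\} \mid X, S=1\right]}, \\
\mathrm{E}\left[L(Y, g(X)) \mid S=0\right]
  &= \mathrm{E}\left[
       \frac{\mathrm{E}\left[L(Y, g(X)) \exp\{\eta\, q(Y)\} \mid X, S=1\right]}
            {\mathrm{E}\left[\exp\{\eta\, q(Y)\} \mid X, S=1\right]}
       \,\middle|\, S=0 \right].
\end{align}
```

Under this sketch, setting $\eta = 0$ recovers the conditional independence assumption, and varying $\eta$ over a plausible range shows how the target-population risk changes as the assumption is violated.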