Background and Objective: Clinical prediction models are usually evaluated on their performance in a population, although decisions are made for individuals. The classic view attributes uncertainty in individual risk estimates to limited sample size (estimation uncertainty); other sources are model uncertainty (variability in modeling choices) and applicability uncertainty (variability in measurement procedures and between populations). We aim to illustrate the uncertainty of prediction models in estimating individual risks, using ovarian cancer diagnosis as an example. Methods: We used real and synthetic data for ovarian cancer diagnosis to train 59,400 models that varied in estimation, model, and applicability uncertainty. We then used these models to estimate the probability of ovarian cancer for each patient in a fixed test set of 100 patients and evaluated the variability in the individual estimates. Results: We show empirically that model uncertainty and applicability uncertainty can strongly dominate estimation uncertainty, even for models that perform well at the population level. Estimation uncertainty decreased considerably with increasing training sample size, whereas model and applicability uncertainty remained large. Conclusion: Individual risk estimates are far more uncertain than often assumed. Model uncertainty and applicability uncertainty usually remain invisible when prediction models or algorithms are based on a single study. Predictive algorithms should inform, not dictate, care, and should support personalization through clinician-patient interaction.
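The distinction between estimation uncertainty and model uncertainty can be made concrete with a small simulation. The sketch below is purely illustrative and is not the study's procedure: the data-generating model, its coefficients, the two predictors `x1` and `x2`, the hypothetical test patient, and the helper names (`simulate`, `fit_logistic`, `risk_spread`) are all assumptions introduced here. It repeatedly draws training sets of a given size, fits two logistic regression models (one with both predictors, one that omits `x2`), and records the estimated risk for one fixed patient. The spread of estimates across repeated samples stands in for estimation uncertainty, and the gap between the two model specifications stands in for model uncertainty.

```python
import math
import random
import statistics


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def solve(a, b):
    # Solve a small linear system a @ x = b by Gauss-Jordan elimination
    # with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(rows, labels, n_iter=10, ridge=1e-6):
    # Newton-Raphson for logistic regression. Each row must start with 1.0
    # (intercept). The tiny diagonal term only keeps the Hessian invertible.
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
        for x, y in zip(rows, labels):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def simulate(rng, n):
    # Hypothetical data-generating process (an assumption of this sketch):
    # two independent standard-normal predictors and a logistic outcome.
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(-1.0 + 1.0 * x1 + 0.8 * x2)
        rows.append((x1, x2))
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def risk_spread(n_train, n_repeats=20, seed=0):
    # Estimated risk for one fixed, hypothetical patient across repeated
    # training samples and two model specifications.
    rng = random.Random(seed)
    px1, px2 = 0.5, 1.5
    full, reduced = [], []
    for _ in range(n_repeats):
        rows, labels = simulate(rng, n_train)
        b_full = fit_logistic([(1.0, x1, x2) for x1, x2 in rows], labels)
        b_red = fit_logistic([(1.0, x1) for x1, _ in rows], labels)
        full.append(sigmoid(b_full[0] + b_full[1] * px1 + b_full[2] * px2))
        reduced.append(sigmoid(b_red[0] + b_red[1] * px1))
    return {
        # Spread across training samples: a proxy for estimation uncertainty.
        "estimation_sd": statistics.pstdev(full),
        # Gap between specifications: a proxy for model uncertainty.
        "model_gap": abs(statistics.mean(full) - statistics.mean(reduced)),
    }


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, risk_spread(n))
```

Under these assumptions, the spread across training samples should shrink as the training set grows, while the gap between the two specifications should persist, because both are reasonable summaries of the same population that condition on different information about the patient. Applicability uncertainty is not modeled here; it could be sketched in the same way by perturbing how `x1` or `x2` is measured between training and use.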