Non-parametric inference on calibration of predicted risks

Moderate calibration, meaning that the expected event probability among observations with predicted probability $\pi$ equals $\pi$, is a desired property of risk prediction models. Current graphical and numerical techniques for evaluating moderate calibration of clinical prediction models are mostly based on smoothing or grouping the data. Moreover, there is no widely accepted inferential method for the null hypothesis that a model is moderately calibrated. In this work, we discuss recently developed methods, and propose novel ones, for assessing moderate calibration for binary responses. The methods rest on the convergence of the distributions of functions of standardized partial sums of prediction errors to the corresponding laws of Brownian motion. The novel method relies on well-known properties of the Brownian bridge, which enable joint inference on mean and moderate calibration, leading to a unified 'bridge' test for detecting miscalibration. Simulation studies indicate that the bridge test is more powerful, often substantially, than the alternative test. As a case study, we consider a prediction model for short-term mortality after a heart attack. Moderate calibration can be assessed without arbitrary grouping of data or methods that require tuning parameters. We suggest presenting the partial sum curves graphically and reporting the strength of evidence indicated by the proposed methods when examining model calibration.
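The partial-sum idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes observations are ordered by predicted risk, that "time" is measured by cumulative Bernoulli variance $\pi_i(1-\pi_i)$, and that the mean-calibration and bridge components are combined with Fisher's method; the function names `partial_sum_process` and `bridge_test` are hypothetical. The one fact it relies on is standard: for Brownian motion $W$, the endpoint $W(1)$ is independent of the bridge $W(t) - tW(1)$, whose supremum of absolute values follows the Kolmogorov distribution.

```python
import numpy as np
from scipy import stats


def partial_sum_process(y, pi):
    """Standardized partial sums of prediction errors, ordered by predicted risk.

    Assumption (not taken from the abstract): time is scaled by the cumulative
    Bernoulli variance pi * (1 - pi), so that t runs from 0 to 1.
    """
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    order = np.argsort(pi, kind="stable")
    y, pi = y[order], pi[order]
    var = pi * (1.0 - pi)
    total = var.sum()
    t = np.cumsum(var) / total               # time axis in (0, 1]
    s = np.cumsum(y - pi) / np.sqrt(total)   # approx. Brownian motion under H0
    return t, s


def bridge_test(y, pi):
    """Illustrative joint test of mean and moderate calibration."""
    t, s = partial_sum_process(y, pi)

    # Endpoint: approx. N(0, 1) under H0, reflects mean calibration.
    z = s[-1]
    p_mean = 2.0 * stats.norm.sf(abs(z))

    # Bridge: remove the linear drift implied by the endpoint.
    bridge = s - t * z
    d = np.max(np.abs(bridge))
    p_bridge = stats.kstwobign.sf(d)         # Kolmogorov limiting distribution

    # Assumption: combine the two independent components with Fisher's method.
    tiny = np.finfo(float).tiny              # guard against log(0)
    fisher = -2.0 * (np.log(max(p_mean, tiny)) + np.log(max(p_bridge, tiny)))
    p_joint = stats.chi2.sf(fisher, df=4)

    return {"p_mean": p_mean, "p_bridge": p_bridge, "p_joint": p_joint}


# Simulated data for illustration only (not from the case study).
rng = np.random.default_rng(0)
pi_hat = rng.uniform(0.05, 0.95, size=2000)
y_calibrated = rng.binomial(1, pi_hat)        # true risk equals predicted risk
y_miscalibrated = rng.binomial(1, pi_hat**2)  # true risk lower than predicted

print(bridge_test(y_calibrated, pi_hat))
print(bridge_test(y_miscalibrated, pi_hat))
```

Plotting `s` against `t` from `partial_sum_process` gives the kind of partial sum curve the abstract recommends presenting; a curve that drifts steadily away from zero suggests miscalibration over the corresponding range of predicted risks.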

