The continuous ranked probability score (CRPS) is the most commonly used scoring rule in the evaluation of probabilistic forecasts for real-valued outcomes. To assess and rank forecasting methods, researchers compute the mean CRPS over given sets of forecast situations, based on the respective predictive distributions and outcomes. We propose a new, isotonicity-based decomposition of the mean CRPS into interpretable components that quantify miscalibration (MSC), discrimination ability (DSC), and uncertainty (UNC), respectively. In a detailed theoretical analysis, we compare the new approach to empirical decompositions proposed earlier, generalize these to population versions, analyse their properties and relationships, and relate them to a hierarchy of notions of calibration. The isotonicity-based decomposition guarantees the nonnegativity of the components and quantifies calibration in a sense that is stronger than for other types of decompositions, subject to the nondegeneracy of empirical decompositions. We illustrate the usage of the isotonicity-based decomposition in case studies from weather prediction and machine learning.