Probabilistic forecasting is increasingly important in high-stakes domains such as finance, epidemiology, and climate science. However, current evaluation frameworks lack a consensus metric and suffer from two critical flaws: they often assume independence across time steps or variables, and they demonstrably lack sensitivity to tail events, the very occurrences that matter most in real-world decision-making. To address these limitations, we propose two kernel-based metrics: the signature maximum mean discrepancy (Sig-MMD) and our novel censored Sig-MMD (CSig-MMD). By leveraging the signature kernel, these metrics capture complex inter-variate and inter-temporal dependencies and remain robust to missing data. Furthermore, CSig-MMD introduces a censoring scheme that emphasizes a forecaster's ability to predict tail events while strictly maintaining properness, a vital property of a good scoring rule. These metrics enable more reliable evaluation of direct multi-step forecasting and thereby facilitate the development of more robust probabilistic algorithms.
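To make the idea of a signature-based MMD concrete, the following is a minimal sketch rather than the implementation proposed here. It assumes piecewise-linear paths, a signature truncated at level 2 (the full signature kernel is untruncated, and a truncated kernel cannot distinguish all path distributions), and a biased V-statistic estimator. The names `sig_features` and `sig_mmd2` are hypothetical. The censoring scheme of CSig-MMD is not sketched, since its details are not given in this abstract.

```python
import numpy as np


def sig_features(paths):
    """Level <= 2 truncated signature of piecewise-linear paths (illustrative).

    paths: array of shape (n_paths, n_steps, n_dims).
    Returns an array of shape (n_paths, 1 + d + d * d).
    """
    paths = np.asarray(paths, dtype=float)
    n, _, d = paths.shape
    inc = np.diff(paths, axis=1)                 # step increments, (n, T-1, d)
    # Displacement from the starting point *before* each step.
    before = np.cumsum(inc, axis=1) - inc
    level1 = inc.sum(axis=1)                     # total increment, (n, d)
    # Second-level iterated integral of a piecewise-linear path:
    # sum_t (before_t + 0.5 * inc_t) outer inc_t
    level2 = np.einsum("nti,ntj->nij", before + 0.5 * inc, inc)
    return np.concatenate(
        [np.ones((n, 1)), level1, level2.reshape(n, d * d)], axis=1
    )


def sig_mmd2(forecast_paths, observed_paths):
    """Biased estimate of squared MMD under the truncated signature kernel.

    With an explicit feature map, the squared MMD is the squared distance
    between the mean feature vectors of the two samples.
    """
    diff = (sig_features(forecast_paths).mean(axis=0)
            - sig_features(observed_paths).mean(axis=0))
    return float(diff @ diff)


# Hypothetical usage: two-dimensional random walks, with and without drift.
rng = np.random.default_rng(0)
observed = np.cumsum(rng.normal(size=(200, 12, 2)), axis=1)
forecast = np.cumsum(rng.normal(loc=0.3, size=(200, 12, 2)), axis=1)
print(sig_mmd2(observed, observed))
print(sig_mmd2(forecast, observed))
```

Because the level-2 terms are built from ordered pairs of increments, the feature map responds to the order in which variables move across time, which is the kind of inter-variate and inter-temporal dependence that step-wise, independence-assuming scores ignore.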