Evaluating forecasts is essential for understanding and improving forecasting and for making forecasts useful to decision makers. Many R packages provide a broad range of scoring rules, visualisations, and diagnostic tools. One particular challenge, which scoringutils aims to address, is handling the complexity of evaluating and comparing forecasts from several forecasters across multiple dimensions such as time, space, and different types of targets. scoringutils extends the existing landscape by offering a convenient and flexible data.table-based framework for evaluating and comparing probabilistic forecasts (forecasts represented by a full predictive distribution). Notably, scoringutils is the first package to offer extensive support for probabilistic forecasts in the form of predictive quantiles, a format currently used by several infectious disease Forecast Hubs. The package is easily extensible: users can supply their own scoring rules or extend existing classes to handle new types of forecasts. scoringutils provides broad functionality to check the data and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The paper presents the package and its core functionality and illustrates common workflows using example data of forecasts for COVID-19 cases and deaths submitted to the European COVID-19 Forecast Hub.