Multivariate distributional forecasts have become widespread in recent years. To assess the quality of such forecasts, suitable evaluation methods are needed. In the univariate case, calibration tests based on the probability integral transform (PIT) are routinely used. However, multivariate extensions of PIT-based calibration tests face various challenges. We therefore introduce a general framework for calibration testing in the multivariate case and propose two new tests that arise from it. Both approaches use proper scoring rules and are simple to implement even in large dimensions. The first employs the PIT of the score. The second is based on comparing the expected performance of the forecast distribution (i.e., the expected score) to its actual performance based on realized observations (i.e., the realized score). The tests have good size and power properties in simulations and solve various problems of existing tests. We apply the new tests to forecast distributions for macroeconomic and financial time series data.
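To make the two ideas concrete, the following is a minimal sketch and not the paper's implementation. It assumes a multivariate Gaussian forecast N(mu, sigma), the logarithmic score as the proper scoring rule, and independent observations; the function names are hypothetical. In this special case the score is an increasing function of the squared Mahalanobis distance, which follows a chi-squared distribution with d degrees of freedom under the forecast, so both the PIT of the score and the expected score are available in closed form. A Kolmogorov-Smirnov test and a simple z-test stand in for whatever test statistics the paper actually uses, and serial dependence in time series data would require a different variance estimate.

```python
import numpy as np
from scipy import stats


def gaussian_log_score(y, mu, sigma):
    """Return the log score (negative log density, lower is better) of the
    forecast N(mu, sigma) at each row of y, plus the squared Mahalanobis
    distance of each row from mu."""
    d = mu.shape[0]
    diff = y - mu
    q = np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * (d * np.log(2 * np.pi) + logdet + q), q


def score_pit_test(y, mu, sigma):
    """First idea: PIT of the score, tested for uniformity.

    Under the forecast, q ~ chi2(d) and the log score is increasing in q,
    so the PIT of the score equals chi2.cdf(q, d). The KS test is an
    assumption of this sketch, not necessarily the paper's choice."""
    _, q = gaussian_log_score(y, mu, sigma)
    pit = stats.chi2.cdf(q, df=mu.shape[0])
    return stats.kstest(pit, "uniform").pvalue


def expected_vs_realized_test(y, mu, sigma):
    """Second idea: compare the expected score with the mean realized score.

    For a Gaussian forecast the expected log score is the entropy, and the
    variance of the score under the forecast is Var(q) / 4 = d / 2.
    Assumes i.i.d. observations (a simplification made for this sketch)."""
    d = mu.shape[0]
    realized, _ = gaussian_log_score(y, mu, sigma)
    _, logdet = np.linalg.slogdet(sigma)
    expected = 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)
    z = np.sqrt(len(y)) * (realized.mean() - expected) / np.sqrt(d / 2)
    return 2 * stats.norm.sf(abs(z))


# Usage: data drawn from N(0, I); one calibrated forecast and one
# forecast whose covariance is too small (overconfident).
rng = np.random.default_rng(0)
d, n = 5, 500
y = rng.standard_normal((n, d))
mu = np.zeros(d)
for name, sigma in [("calibrated", np.eye(d)), ("overconfident", 0.5 * np.eye(d))]:
    print(name, score_pit_test(y, mu, sigma), expected_vs_realized_test(y, mu, sigma))
```

For forecast distributions without closed-form expressions, the same two quantities could in principle be approximated by simulating draws from the forecast and scoring them, which is one reason a score-based approach remains workable in large dimensions.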