The goal of probabilistic prediction is to issue predictive distributions that are as informative as possible, subject to being calibrated. Despite substantial progress in the univariate setting, achieving multivariate calibration remains challenging. Recent work has introduced pre-rank functions, scalar projections of multivariate forecasts and observations, as flexible diagnostics for assessing specific aspects of multivariate calibration, but their use has largely been limited to post-hoc evaluation. We propose a regularization-based calibration method that enforces multivariate calibration during training of multivariate distributional regression models using pre-rank functions. We further introduce a novel PCA-based pre-rank that projects predictions onto principal directions of the predictive distribution. Through simulation studies and experiments on 18 real-world multi-output regression datasets, we show that the proposed approach substantially improves multivariate pre-rank calibration without compromising predictive accuracy, and that the PCA pre-rank reveals dependence-structure misspecifications that are not detected by existing pre-ranks.
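To make the idea of a pre-rank diagnostic concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not define the PCA pre-rank, so the sketch assumes one plausible reading: for each forecast, draw samples from the predictive distribution, estimate its principal directions from those samples, and use the projection onto one direction as the scalar pre-rank. The observation's rank among the projected samples gives a PIT value, which should be uniform under calibration. All function names (`pca_prerank`, `prerank_pit`, `ks_distance_to_uniform`, `demo`), the Gaussian toy setup, and the use of a Kolmogorov-Smirnov distance as a summary are illustrative assumptions.

```python
import numpy as np


def pca_prerank(samples, point, component=0):
    """Project `point` onto a principal direction of the predictive samples.

    Assumed definition (not taken from the paper): directions are the
    eigenvectors of the sample covariance, ordered by decreasing eigenvalue.
    `point` may be a single vector of shape (d,) or an array of shape (k, d).
    """
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    direction = eigvecs[:, ::-1][:, component]
    # Eigenvector signs are arbitrary; fix them so the largest entry is positive.
    if direction[np.argmax(np.abs(direction))] < 0:
        direction = -direction
    return (point - mean) @ direction


def prerank_pit(samples, obs, prerank, rng):
    """Randomized rank-based PIT of the observation's pre-rank among the samples'."""
    s = prerank(samples, samples)              # pre-ranks of the m samples
    o = prerank(samples, obs)                  # pre-rank of the observation
    below = np.sum(s < o)
    ties = np.sum(s == o)
    return (below + rng.uniform() * (ties + 1)) / (len(s) + 1)


def ks_distance_to_uniform(pits):
    """Kolmogorov-Smirnov distance between the empirical PIT CDF and U(0, 1)."""
    u = np.sort(pits)
    n = len(u)
    upper = np.arange(1, n + 1) / n
    return max(np.max(upper - u), np.max(u - (upper - 1.0 / n)))


def demo(forecast_corr, true_corr=0.8, n=2000, m=200, component=1, seed=0):
    """Toy check: bivariate Gaussian truth vs. a forecast with a chosen correlation."""
    rng = np.random.default_rng(seed)
    true_cov = np.array([[1.0, true_corr], [true_corr, 1.0]])
    fc_cov = np.array([[1.0, forecast_corr], [forecast_corr, 1.0]])

    def prerank(s, p):
        return pca_prerank(s, p, component)

    pits = np.empty(n)
    for i in range(n):
        obs = rng.multivariate_normal(np.zeros(2), true_cov)
        samples = rng.multivariate_normal(np.zeros(2), fc_cov, size=m)
        pits[i] = prerank_pit(samples, obs, prerank, rng)
    return ks_distance_to_uniform(pits)


if __name__ == "__main__":
    print("correct dependence:       ", demo(forecast_corr=0.8))
    print("underestimated dependence:", demo(forecast_corr=0.3))
```

In this toy setup both forecasts have correct marginals, so a per-coordinate check would not separate them. The projection onto the second principal direction does: the forecast that underestimates the correlation is too dispersed along that direction, and its PIT values concentrate around 0.5. A training-time regularizer of the kind described in the abstract would need a differentiable measure of this non-uniformity, which this sketch does not attempt.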