In statistics, forecast uncertainty is often quantified using a specified statistical model, though such approaches may be vulnerable to model misspecification, selection bias, and limited finite-sample validity. While bootstrapping can potentially mitigate some of these concerns, it is often computationally demanding. Instead, we take a model-agnostic and distribution-free approach, namely conformal prediction, to construct prediction intervals in high-dimensional functional time series. Among a rich family of conformal prediction methods, we study split and sequential conformal prediction. In split conformal prediction, the data are divided into training, validation, and test sets, where the validation set is used to select optimal tuning parameters by calibrating empirical coverage probabilities to match nominal levels; after this, prediction intervals are constructed for the test set, and their accuracy is evaluated. In contrast, sequential conformal prediction removes the need for a validation set by updating predictive quantiles sequentially via an autoregressive process. Using subnational age-specific log-mortality data from Japan and Canada, we compare the finite-sample forecast performance of these two conformal methods using empirical coverage probability and the mean interval score.
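As a rough illustration of the split conformal workflow and the two evaluation metrics named above, the following minimal sketch may help. It makes several assumptions that are not in the abstract: a synthetic scalar AR(1) series stands in for the high-dimensional functional mortality data, a least-squares AR(1) forecaster stands in for the actual forecasting method, and the interval half-width is the standard conformal quantile of absolute calibration residuals rather than the tuning-parameter selection described above. All function and variable names are hypothetical.

```python
import math
import numpy as np


def split_conformal_halfwidth(calib_residuals, alpha):
    """Conformal quantile of the absolute calibration residuals (generic variant)."""
    scores = np.sort(np.abs(np.asarray(calib_residuals, dtype=float)))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))  # rank with finite-sample correction
    if k > n:  # too few calibration points for this level
        return float("inf")
    return float(scores[k - 1])


def empirical_coverage(y, lower, upper):
    """Share of observations that fall inside their prediction interval."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))


def mean_interval_score(y, lower, upper, alpha):
    """Interval width plus a 2/alpha penalty for each miss, averaged over time."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    width = upper - lower
    below = (2.0 / alpha) * (lower - y) * (y < lower)
    above = (2.0 / alpha) * (y - upper) * (y > upper)
    return float(np.mean(width + below + above))


# Synthetic AR(1) series (assumption: a stand-in for real mortality data).
rng = np.random.default_rng(0)
T = 600
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.7 * y[t - 1] + rng.normal(scale=0.5)

# Chronological split: training, validation (calibration), test.
train_end, calib_end = 300, 450

# Hypothetical point forecaster: AR(1) coefficient fitted on the training block.
x_prev, x_next = y[: train_end - 1], y[1:train_end]
phi = float(np.dot(x_next, x_prev) / np.dot(x_prev, x_prev))

# One-step-ahead forecasts: forecast[t] predicts y[t] from y[t - 1].
forecast = np.full(T, np.nan)
forecast[1:] = phi * y[:-1]

alpha = 0.2  # nominal coverage of 80%
q = split_conformal_halfwidth(
    y[train_end:calib_end] - forecast[train_end:calib_end], alpha
)

lower = forecast[calib_end:] - q
upper = forecast[calib_end:] + q
y_test = y[calib_end:]

coverage = empirical_coverage(y_test, lower, upper)
score = mean_interval_score(y_test, lower, upper, alpha)
print(f"half-width={q:.3f}  coverage={coverage:.3f}  mean interval score={score:.3f}")
```

An empirical coverage close to the nominal level (here 0.8) together with a small mean interval score would indicate intervals that are both well calibrated and narrow. The sequential variant, which updates predictive quantiles through an autoregressive process, is not sketched here because the abstract does not specify its details.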