Conformal prediction methods enjoy strong theoretical and empirical performance for predictive inference, provided that the data are exchangeable and the predictors are trained in a memoryless fashion. However, these assumptions and constraints are impractical in many real-data settings such as time series, where temporal dependence violates exchangeability and memoryless predictors will inevitably have poor predictive accuracy. Recent work shows that the split conformal prediction method is robust both to memory-based predictors and to the deviations from exchangeability that are common in time series data. However, because sample splitting can lead to lower accuracy, it is natural to ask whether other predictive inference methods, which do not rely on data splitting, can also be used reliably in the time series setting. In this work, we show that the vanilla leave-one-out jackknife can suffer an arbitrary loss of coverage even in canonical time series models with mild temporal dependence. As a remedy, we propose a careful modification tailored to such settings, which we term the \emph{leave-a-window-out} (LWO) method, and show that it achieves valid coverage provided that the model-fitting procedure satisfies mild stability properties. Our proofs quantify the degree to which the data depart from \emph{cyclic exchangeability}, and we introduce new coefficients to measure the extent of this departure. Experiments on time series data demonstrate that our LWO method often attains valid coverage when the vanilla jackknife fails to cover, while producing much narrower intervals than split conformal prediction.
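To make the contrast between the leave-one-out jackknife and a windowed variant concrete, the following is a minimal illustrative sketch and not the construction analyzed in this work. It assumes a symmetric window of radius `window` around each held-out index, absolute residuals as scores, a least-squares base predictor, and a conservative empirical quantile; all function and parameter names (`fit_linear`, `leave_window_out_interval`, `window`, `alpha`) are hypothetical. Setting `window=0` recovers the ordinary leave-one-out jackknife interval.

```python
import numpy as np

def fit_linear(X, y):
    # Least-squares fit with an intercept; returns a prediction function.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xnew: np.column_stack([np.ones(len(Xnew)), Xnew]) @ coef

def leave_window_out_interval(X, y, x_new, window=0, alpha=0.1):
    """Jackknife-style interval; window=0 is plain leave-one-out.

    The symmetric window and the quantile rule are assumptions made
    for illustration, not the definition used in the paper.
    """
    n = len(y)
    residuals = np.empty(n)
    for i in range(n):
        # Refit without any point whose index is within `window` of i.
        keep = np.abs(np.arange(n) - i) > window
        predict = fit_linear(X[keep], y[keep])
        residuals[i] = abs(y[i] - predict(X[i:i + 1])[0])
    # Conservative (1 - alpha) empirical quantile of held-out residuals.
    k = min(n, int(np.ceil((1 - alpha) * (n + 1))))
    q = np.sort(residuals)[k - 1]
    center = fit_linear(X, y)(np.atleast_2d(x_new))[0]
    return center - q, center + q

# Usage on a simulated AR(1) series, predicting each value from its lag.
rng = np.random.default_rng(0)
series = np.zeros(201)
for t in range(1, 201):
    series[t] = 0.5 * series[t - 1] + rng.normal()
X, y = series[:-1, None], series[1:]
lo, hi = leave_window_out_interval(X[:-1], y[:-1], X[-1], window=5)
```

In this sketch, a larger `window` removes the neighbors most correlated with the held-out point before refitting, at the cost of fitting each model on fewer observations.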