We consider the problem of forming prediction sets in an online setting where the distribution generating the data is allowed to vary over time. Previous approaches to this problem over-weight historical data and thus may fail to react quickly to the underlying dynamics. Here we correct this issue and develop a novel procedure with provably small regret over all local time intervals of a given width. We achieve this by modifying the adaptive conformal inference (ACI) algorithm of Gibbs and Candès (2021) to include an additional step in which the step-size parameter of ACI's gradient descent update is tuned over time. Crucially, this means that unlike ACI, which requires knowledge of the rate of change of the data-generating mechanism, our new procedure is adaptive to both the size and type of the distribution shift. Our methods are highly flexible: they require no distributional assumptions and can be combined with any baseline predictive algorithm that produces point estimates or estimated quantiles of the target. We test our techniques on two real-world datasets, one for predicting stock market volatility and one for predicting COVID-19 case counts, and find that they are robust and adaptive to real-world distribution shifts.
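For orientation, the baseline that the procedure modifies can be sketched as follows. This is a minimal illustration of the fixed-step ACI update $\alpha_{t+1} = \alpha_t + \gamma(\alpha - \mathrm{err}_t)$ only; it does not implement the step-size tuning described above. The function name `aci_fixed_step`, the `warmup` parameter, the use of all past conformity scores as the calibration set, and the synthetic data are assumptions made for this example.

```python
import numpy as np

def aci_fixed_step(scores, alpha=0.1, gamma=0.05, warmup=20):
    """Fixed-step ACI over a stream of conformity scores (illustrative sketch).

    scores : conformity scores, e.g. absolute residuals of any baseline predictor
    alpha  : target miscoverage level
    gamma  : fixed step size (the quantity the paper proposes to tune over time)
    warmup : number of initial scores used only for calibration (assumption)
    """
    scores = np.asarray(scores, dtype=float)
    alpha_t = alpha
    alphas, errs = [], []
    for t in range(warmup, len(scores)):
        if alpha_t <= 0.0:
            err = 0  # the prediction set is the whole space, so it always covers
        elif alpha_t >= 1.0:
            err = 1  # the prediction set is empty, so it never covers
        else:
            # Threshold: empirical (1 - alpha_t) quantile of the past scores.
            q = np.quantile(scores[:t], 1.0 - alpha_t)
            err = int(scores[t] > q)
        alphas.append(alpha_t)
        errs.append(err)
        # Online gradient step: widen sets after a miss, shrink them after a cover.
        alpha_t = alpha_t + gamma * (alpha - err)
    return np.array(alphas), np.array(errs)

# Synthetic stream whose scale triples halfway through (hypothetical data).
rng = np.random.default_rng(0)
stream = np.abs(np.concatenate([rng.normal(0, 1, 1000), rng.normal(0, 3, 1000)]))
alphas, errs = aci_fixed_step(stream, alpha=0.1, gamma=0.05)
print("empirical miscoverage:", errs.mean())
```

A larger `gamma` reacts faster to shifts but makes the sets more volatile, while a smaller `gamma` does the opposite; choosing it in advance requires knowing how fast the distribution changes, which is the dependence the tuned procedure aims to remove.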