Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting

Distribution shift in Time Series Forecasting (TSF), meaning that the distribution of a series changes over time, largely hinders the performance of TSF models. Existing work on distribution shift in time series is mostly limited in how it quantifies the distribution and, more importantly, overlooks the potential shift between lookback and horizon windows. To address these challenges, we systematically summarize distribution shift in TSF into two categories. Regarding lookback windows as the input-space and horizon windows as the output-space, there exist (i) intra-space shift, in which the distribution within the input-space keeps shifting over time, and (ii) inter-space shift, in which the distribution shifts between the input-space and the output-space. We then introduce Dish-TS, a general neural paradigm for alleviating distribution shift in TSF. Specifically, for better distribution estimation, we propose the coefficient net (CONET), which can be any neural architecture, to map input sequences into learnable distribution coefficients. To relieve intra-space and inter-space shift, we organize Dish-TS as a Dual-CONET framework that separately learns the distributions of the input-space and the output-space, which naturally captures the distribution difference between the two spaces. In addition, we introduce a more effective training strategy for the otherwise intractable CONET learning. Finally, we conduct extensive experiments on several datasets coupled with different state-of-the-art forecasting models. Experimental results show that Dish-TS consistently boosts them, with an average improvement of more than 20%. Code is available.
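The Dual-CONET idea described above can be illustrated with a minimal NumPy sketch. The abstract does not specify the form of the coefficients or the architecture, so everything here is an assumption made for illustration: the class `LinearConet`, the choice of a level and a scale as the two distribution coefficients, the helper `dual_conet_forecast`, and the placeholder `naive_model` are all hypothetical names, and the weights are left untrained rather than learned as they would be in practice.

```python
import numpy as np

rng = np.random.default_rng(0)


class LinearConet:
    """Hypothetical coefficient net: one linear map over the lookback window.

    Assumption: the distribution coefficients are a level and a scale.
    In a real setting the weights would be learned; here they stay fixed.
    """

    def __init__(self, lookback_len):
        # Start close to uniform averaging, so the level is near the window mean.
        noise = 0.01 * rng.standard_normal(lookback_len)
        self.w = np.full(lookback_len, 1.0 / lookback_len) + noise

    def __call__(self, x):
        # x has shape (batch, lookback_len).
        level = x @ self.w  # shape (batch,)
        # Scale is the root mean squared deviation around the level.
        scale = np.sqrt(np.mean((x - level[:, None]) ** 2, axis=1) + 1e-8)
        return level, scale


def naive_model(x_norm, horizon_len=4):
    """Placeholder forecaster: repeats the last normalized value."""
    return np.repeat(x_norm[:, -1:], horizon_len, axis=1)


def dual_conet_forecast(x, model, back_conet, hori_conet):
    """Normalize with input-space coefficients, denormalize with output-space ones."""
    level_in, scale_in = back_conet(x)    # describes the lookback window
    level_out, scale_out = hori_conet(x)  # estimate for the horizon window
    x_norm = (x - level_in[:, None]) / scale_in[:, None]
    y_norm = model(x_norm)
    return y_norm * scale_out[:, None] + level_out[:, None]


# Two random-walk series with a lookback window of 8 steps.
x = rng.standard_normal((2, 8)).cumsum(axis=1)
back_conet = LinearConet(8)
hori_conet = LinearConet(8)
y = dual_conet_forecast(x, naive_model, back_conet, hori_conet)
print(y.shape)  # (2, 4)
```

Using two separate coefficient nets is what lets the sketch represent inter-space shift: if the same net were used for both steps, denormalization would simply undo normalization, and the forecast could not sit at a different level or scale than the lookback window.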
