Self-supervised learning has recently been actively studied in the time series domain, especially masked reconstruction. Most of these methods follow the "Pre-training + Fine-tuning" paradigm, in which a new decoder replaces the pre-trained decoder to fit a specific downstream task, leading to inconsistency between upstream and downstream tasks. In this paper, we first point out that unifying task objectives and adapting to task difficulty are critical for bridging the gap between time series masked reconstruction and forecasting. By retaining the pre-trained mask token during the fine-tuning stage, forecasting can be treated as a special case of masked reconstruction, in which the future values are masked and reconstructed from the historical values. This guarantees consistency of task objectives, but a gap in task difficulty remains, because masked reconstruction can use contextual information around the masked values, whereas forecasting can use only historical information. To further narrow this gap, we propose a simple yet effective prompt token tuning (PT-Tuning) paradigm, in which all pre-trained parameters are frozen and only a few trainable prompt tokens are added element-wise to the extended mask tokens. Extensive experiments on real-world datasets demonstrate that the proposed paradigm achieves state-of-the-art performance compared with representation learning and end-to-end supervised forecasting methods.
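The idea of treating forecasting as masked reconstruction can be illustrated with a small, framework-free sketch. It is not the authors' implementation: the function name `build_forecast_input`, the toy list-of-floats embeddings, and the shapes are all assumptions made for illustration, and the frozen pre-trained encoder and decoder are not modeled. The sketch only shows how a single pre-trained mask token might be extended over the forecast horizon and combined element-wise with one trainable prompt token per future step.

```python
def build_forecast_input(history_tokens, mask_token, prompt_tokens):
    """Build the token sequence for forecasting-as-masked-reconstruction.

    history_tokens: list of embedding vectors for the observed steps.
    mask_token:     the single pre-trained mask embedding (kept frozen).
    prompt_tokens:  one trainable vector per future step (hypothetical layout).

    Returns history tokens followed by (mask_token + prompt) for each
    future step, where the addition is element-wise.
    """
    dim = len(mask_token)
    for prompt in prompt_tokens:
        if len(prompt) != dim:
            raise ValueError("prompt and mask token dimensions must match")

    # Extend the mask token over the horizon and add each prompt element-wise.
    future_tokens = [
        [m + p for m, p in zip(mask_token, prompt)]
        for prompt in prompt_tokens
    ]
    # New outer list, so the caller's history list is left untouched.
    return list(history_tokens) + future_tokens


# Toy usage: 2 observed steps, horizon of 3, embedding size 2.
history = [[1.0, 2.0], [3.0, 4.0]]
mask = [0.5, -0.5]
prompts = [[0.0, 0.0], [0.25, 0.5], [1.0, 1.0]]
sequence = build_forecast_input(history, mask, prompts)
```

In this sketch, only the values in `prompts` would be updated during tuning, while `mask` and every parameter of the pre-trained model would stay fixed. With all-zero prompts the input reduces to plain mask-token extension, which corresponds to the objective-consistent but difficulty-unadapted setting described in the abstract.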