Time series analysis is used across a wide range of domains. To reduce labeling costs and benefit diverse downstream tasks, self-supervised pre-training has recently attracted considerable interest. One mainstream paradigm is masked modeling, which pre-trains deep models by learning to reconstruct masked content from the unmasked part. However, because the semantic information of a time series lies mainly in its temporal variations, the standard practice of randomly masking a portion of time points severely disrupts those variations and makes the reconstruction task too difficult to guide representation learning. We therefore present SimMTM, a Simple pre-training framework for Masked Time-series Modeling. By relating masked modeling to manifold learning, SimMTM recovers masked time points through a weighted aggregation of multiple neighbors outside the manifold, which eases reconstruction by assembling corrupted but complementary temporal variations from multiple masked series. SimMTM also learns to uncover the local structure of the manifold, which in turn helps masked modeling. Experimentally, SimMTM achieves state-of-the-art fine-tuning performance against the most advanced time series pre-training methods on two canonical time series analysis tasks, forecasting and classification, in both in-domain and cross-domain settings.
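The aggregation idea can be illustrated with a toy sketch. The abstract does not specify the encoder, the similarity measure, or the decoder, so everything below is an assumption made for illustration: raw masked series stand in for learned representations, cosine similarity with a softmax temperature `tau` supplies the weights, and a keep-mask normalization replaces a learned decoder. The function names `random_mask`, `neighbor_weights`, and `aggregate` are hypothetical and are not taken from the SimMTM code.

```python
import numpy as np


def random_mask(x, mask_ratio, rng):
    """Zero out a random subset of time points; return the masked series and its keep-mask."""
    keep = rng.random(x.shape) >= mask_ratio
    return x * keep, keep


def neighbor_weights(views, tau=0.1):
    """Softmax over cosine similarities between views, excluding each view itself.

    Assumption: the raw masked series play the role of series-wise representations.
    """
    norms = np.linalg.norm(views, axis=1, keepdims=True)
    unit = views / np.maximum(norms, 1e-8)
    sim = unit @ unit.T / tau
    np.fill_diagonal(sim, -np.inf)                # a view may not be its own neighbor
    sim = sim - sim.max(axis=1, keepdims=True)    # numerical stability
    w = np.exp(sim)
    return w / w.sum(axis=1, keepdims=True)


def aggregate(views, keeps, weights):
    """Rebuild each view as a weighted average of the values its neighbors kept.

    Assumption: dividing by the weighted keep-mask stands in for a learned decoder.
    """
    num = weights @ views
    den = weights @ keeps.astype(float)
    return num / np.maximum(den, 1e-8)


rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
originals = np.stack([np.sin(t), np.sin(3.0 * t)])   # two toy series
labels = np.repeat(np.arange(2), 6)                  # six masked views per series

masked = [random_mask(originals[k], 0.5, rng) for k in labels]
views = np.stack([m[0] for m in masked])
keeps = np.stack([m[1] for m in masked])

weights = neighbor_weights(views)
recon = aggregate(views, keeps, weights)

# Share of each view's weight that falls on views of the same original series.
same = labels[:, None] == labels[None, :]
same_mass = (weights * same).sum(axis=1).mean()

err_masked = np.abs(views - originals[labels]).mean()
err_recon = np.abs(recon - originals[labels]).mean()
print(f"weight on same-series views: {same_mass:.3f}")
print(f"mean abs error, masked views: {err_masked:.3f}")
print(f"mean abs error, aggregated:   {err_recon:.3f}")
```

In this toy setting each masked view has lost about half of its points, but different views lose different points, so a similarity-weighted average over the other views can fill in what any single view is missing. That is the sense in which the masked series are "corrupted but complementary".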