TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders

Enhancing the expressive capacity of deep learning-based time series models with self-supervised pre-training has become increasingly prevalent in time series classification. Although numerous efforts have been devoted to developing self-supervised models for time series data, we argue that current methods are not sufficient to learn optimal time series representations because they rely solely on unidirectional encoding over sparse point-wise input units. In this work, we propose TimeMAE, a novel self-supervised paradigm for learning transferable time series representations based on transformer networks. The distinguishing feature of TimeMAE is that it processes each time series into a sequence of non-overlapping sub-series via window-slicing partitioning, and then applies random masking over the semantic units of these localized sub-series. This simple yet effective setting kills three birds with one stone: (1) learning enriched contextual representations of time series with a bidirectional encoding scheme; (2) increasing the information density of basic semantic units; (3) efficiently encoding representations of time series using transformer networks. Nevertheless, performing the reconstruction task under this newly formulated modeling paradigm is non-trivial. To resolve the discrepancy introduced by the newly injected masked embeddings, we design a decoupled autoencoder architecture, which learns the representations of visible (unmasked) positions and masked positions with two different encoder modules, respectively. Furthermore, we construct two types of informative targets to accomplish the corresponding pretext tasks. One is to create a tokenizer module that assigns a codeword to each masked region, allowing the masked codeword classification (MCC) task to be completed effectively...
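The window-slicing and random-masking steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `slice_windows` and `random_mask`, the window length of 8, the mask ratio of 0.6, and the choice to drop a trailing remainder shorter than one window are all assumptions made for illustration.

```python
import numpy as np


def slice_windows(series, window):
    """Split a (T, C) series into non-overlapping sub-series of shape (N, window, C).

    Assumption: a trailing remainder shorter than `window` is dropped.
    """
    num_windows = series.shape[0] // window
    trimmed = series[: num_windows * window]
    return trimmed.reshape(num_windows, window, series.shape[1])


def random_mask(num_windows, mask_ratio, rng):
    """Return a boolean array of shape (num_windows,); True marks a masked sub-series."""
    num_masked = int(round(num_windows * mask_ratio))
    mask = np.zeros(num_windows, dtype=bool)
    masked_idx = rng.choice(num_windows, size=num_masked, replace=False)
    mask[masked_idx] = True
    return mask


rng = np.random.default_rng(0)
series = rng.standard_normal((128, 3))  # toy series: 128 time steps, 3 channels

windows = slice_windows(series, window=8)          # hypothetical window length
mask = random_mask(len(windows), mask_ratio=0.6, rng=rng)  # hypothetical ratio

# In a decoupled design, visible and masked sub-series are handled separately.
visible = windows[~mask]
masked = windows[mask]
print(windows.shape, visible.shape, masked.shape)
```

Masking whole sub-series rather than individual points means each masked unit covers a contiguous local pattern, which is the sense in which the abstract speaks of raising the information density of the basic semantic units.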
