Deep learning-based algorithms, e.g., convolutional networks, have significantly advanced the multivariate time series classification (MTSC) task. Nevertheless, they are limited in modeling long-range dependencies because of the nature of convolution operations. Recent advances have shown the potential of transformers to capture long-range dependencies. However, because of the distinct properties of time series data, applying transformers directly to the MTSC task incurs severe issues, such as fixed-scale representations, temporal invariance, and quadratic time complexity. To tackle these issues, we propose FormerTime, a hierarchical representation model for improving classification capacity on the MTSC task. In FormerTime, we employ a hierarchical network architecture to produce multi-scale feature maps. In addition, we design a novel transformer encoder that incorporates an efficient temporal reduction attention layer and a well-informed contextual positional encoding generation strategy. To sum up, FormerTime has three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism. Extensive experiments on $10$ publicly available datasets from the UEA archive verify the superiority of FormerTime over previous competitive baselines.
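The abstract names a temporal reduction attention layer as the remedy for quadratic self-attention cost but does not specify its form. The following is a minimal sketch of one common way such a reduction can be realized, not the FormerTime implementation: queries keep the full length $T$, while keys and values are computed from a sequence shortened by a factor `reduction`. The mean-pooling reduction, the single-head layout, and every name here (`temporal_reduction_attention`, `w_q`, `w_k`, `w_v`, `reduction`) are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def temporal_reduction_attention(x, w_q, w_k, w_v, reduction=4):
    """Single-head attention with temporally reduced keys and values.

    x          : (T, d) array, one multivariate series embedded per time step.
    w_q, w_k   : (d, d_k) projection matrices (hypothetical parameters).
    w_v        : (d, d_v) projection matrix (hypothetical parameter).
    reduction  : assumed pooling factor; T must be divisible by it.
    """
    T, d = x.shape
    if T % reduction != 0:
        raise ValueError("sequence length must be divisible by `reduction`")

    # Assumption: shorten the sequence by mean-pooling non-overlapping windows.
    x_red = x.reshape(T // reduction, reduction, d).mean(axis=1)

    q = x @ w_q        # (T, d_k): queries keep full temporal resolution
    k = x_red @ w_k    # (T / reduction, d_k)
    v = x_red @ w_v    # (T / reduction, d_v)

    # The score matrix is (T, T / reduction) instead of (T, T).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = temporal_reduction_attention(x, w_q, w_k, w_v, reduction=4)
print(out.shape, weights.shape)
```

Under these assumptions the attention map shrinks from $T \times T$ to $T \times T/r$ entries, so cost still grows with $T^2$ but is divided by the reduction factor $r$, while the output keeps one vector per original time step.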