Transformer-based models have emerged as powerful tools for multivariate time series forecasting (MTSF). However, existing Transformer models often fail to capture the intricate dependencies that span both the variate and temporal dimensions of multivariate time series (MTS) data. Some recent models capture variate and temporal dependencies separately, through two attention mechanisms applied either sequentially or in parallel. However, these methods cannot directly and explicitly learn the intricate inter-series and intra-series dependencies. In this work, we first demonstrate that these dependencies are important because they commonly exist in real-world data. To model them directly, we propose UniTST, a Transformer-based model with a unified attention mechanism over the flattened patch tokens. Additionally, we add a dispatcher module that reduces the complexity and makes the model feasible for a potentially large number of variates. Although our proposed model employs a simple architecture, it offers compelling performance, as shown in our extensive experiments on several time series forecasting datasets.
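The two architectural ideas named above, unified attention over flattened patch tokens and a dispatcher module, can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the UniTST implementation: the function names, the patch length, the number of dispatcher tokens, and the two-step "aggregate, then distribute" form of the dispatcher are all hypothetical choices, and the learned query/key/value projections of a real Transformer layer are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention (no learned projections, for brevity).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def flatten_patches(series, patch_len):
    # series: (num_variates, length)
    # returns: (num_variates * num_patches, patch_len), one token per patch,
    # ordered variate by variate.
    n_var, length = series.shape
    n_patch = length // patch_len
    trimmed = series[:, : n_patch * patch_len]
    return trimmed.reshape(n_var * n_patch, patch_len)

def unified_attention(tokens):
    # Every patch token attends to every other token, regardless of which
    # variate or time span it came from. Cost grows quadratically in the
    # number of tokens.
    return attention(tokens, tokens, tokens)

def dispatcher_attention(tokens, dispatchers):
    # Assumed dispatcher scheme: k dispatcher tokens first aggregate
    # information from all tokens, then every token reads it back.
    # Cost grows as k * num_tokens instead of num_tokens ** 2.
    summary = attention(dispatchers, tokens, tokens)   # (k, d)
    return attention(tokens, summary, summary)         # (num_tokens, d)

rng = np.random.default_rng(0)
series = rng.standard_normal((7, 96))            # 7 variates, 96 time steps
tokens = flatten_patches(series, patch_len=16)   # 7 * 6 = 42 tokens
full = unified_attention(tokens)
dispatchers = rng.standard_normal((4, 16))       # stand-in for learned embeddings
cheap = dispatcher_attention(tokens, dispatchers)
print(tokens.shape, full.shape, cheap.shape)
```

In this sketch, a single attention map covers all pairs of patches, so a patch of one variate can attend directly to a patch of another variate at a different time. The dispatcher variant trades that full map for two smaller ones, which is what keeps the cost manageable when the number of variates is large.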