Real-world time-series datasets are often multivariate with complex dynamics. To capture this complexity, high-capacity architectures such as recurrent or attention-based sequential deep learning models have become popular. However, recent work demonstrates that simple univariate linear models can outperform such deep learning models on several commonly used academic benchmarks. Extending this line of work, we investigate the capabilities of linear models for time-series forecasting and present Time-Series Mixer (TSMixer), a novel architecture designed by stacking multi-layer perceptrons (MLPs). TSMixer is based on mixing operations along both the time and feature dimensions to extract information efficiently. On popular academic benchmarks, the simple-to-implement TSMixer is comparable to specialized state-of-the-art models that leverage the inductive biases of specific benchmarks. On the challenging, large-scale M5 benchmark, a real-world retail dataset, TSMixer demonstrates superior performance compared to state-of-the-art alternatives. Our results underline the importance of efficiently utilizing cross-variate and auxiliary information for improving the performance of time-series forecasting. We present various analyses to shed light on the capabilities of TSMixer. The design paradigms used in TSMixer are expected to open new horizons for deep learning-based time-series forecasting.
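To make the idea of mixing along the time and feature dimensions concrete, the following is a minimal NumPy sketch of one way such a block could look. It is an illustration under stated assumptions, not the reference TSMixer implementation: the ReLU activations, the residual connections, the hidden width, the final linear projection over time, and all function and parameter names (`time_mixing`, `feature_mixing`, `mixer_forecast`, `init_block`) are hypothetical choices, and normalization and dropout are omitted.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def time_mixing(x, w, b):
    """Mix across time steps with weights shared by all features.

    x: (batch, time, features), w: (time, time), b: (time,)
    """
    xt = np.swapaxes(x, 1, 2)          # (batch, features, time)
    h = relu(xt @ w + b)               # linear layer acts on the time axis
    return x + np.swapaxes(h, 1, 2)    # residual connection (assumption)


def feature_mixing(x, w1, b1, w2, b2):
    """Mix across features with weights shared by all time steps.

    w1: (features, hidden), w2: (hidden, features)
    """
    h = relu(x @ w1 + b1)              # (batch, time, hidden)
    return x + (h @ w2 + b2)           # residual connection (assumption)


def mixer_forecast(x, blocks, w_out, b_out):
    """Apply stacked mixing blocks, then project time -> horizon."""
    for blk in blocks:
        x = time_mixing(x, blk["w_t"], blk["b_t"])
        x = feature_mixing(x, blk["w1"], blk["b1"], blk["w2"], blk["b2"])
    xt = np.swapaxes(x, 1, 2)          # (batch, features, time)
    y = xt @ w_out + b_out             # (batch, features, horizon)
    return np.swapaxes(y, 1, 2)        # (batch, horizon, features)


def init_block(rng, time, features, hidden, scale=0.1):
    """Random (untrained) parameters for one block; illustrative only."""
    return {
        "w_t": scale * rng.normal(size=(time, time)),
        "b_t": np.zeros(time),
        "w1": scale * rng.normal(size=(features, hidden)),
        "b1": np.zeros(hidden),
        "w2": scale * rng.normal(size=(hidden, features)),
        "b2": np.zeros(features),
    }


rng = np.random.default_rng(0)
lookback, horizon, n_features = 96, 24, 7   # arbitrary example sizes
x = rng.normal(size=(4, lookback, n_features))
blocks = [init_block(rng, lookback, n_features, hidden=16) for _ in range(2)]
w_out = 0.1 * rng.normal(size=(lookback, horizon))
y = mixer_forecast(x, blocks, w_out, np.zeros(horizon))
print(y.shape)  # (4, 24, 7)
```

In this sketch, the time-mixing step applies the same weights to every feature and the feature-mixing step applies the same weights to every time step, so the parameter count grows with the lookback length and the number of features separately rather than with their product.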