Mlinear: Rethink the Linear Model for Time-series Forecasting

Recently, time-series forecasting research has advanced significantly, with growing attention to the nature of time-series data, e.g., channel-independence (CI) and channel-dependence (CD), rather than solely to the design of sophisticated forecasting models. However, current research has primarily treated CI or CD in isolation, and how to combine these two opposing properties to achieve a synergistic effect remains unresolved. In this paper, we carefully examine the opposing properties of CI and CD and raise a practical question that has not been effectively answered: "How to effectively mix the CI and CD properties of time series to achieve better predictive performance?" To answer this question, we propose Mlinear (MIX-Linear), a simple yet effective method based mainly on linear layers. The design philosophy of Mlinear has two main aspects: (1) dynamically tuning the CI and CD properties based on the time semantics of different input time series, and (2) providing deep supervision to adjust the individual performance of the "CI predictor" and the "CD predictor". In addition, we introduce a new loss function that, empirically, significantly outperforms the widely used mean squared error (MSE) on multiple datasets. Experiments on widely used time-series datasets covering multiple fields demonstrate the superiority of our method over PatchTST, the latest Transformer-based method, in terms of the MSE and MAE metrics on 7 datasets with identical input sequence lengths (336 or 512). Specifically, our method outperforms PatchTST by a ratio of 21:3 with an input sequence length of 336 and by 29:10 with an input sequence length of 512. Additionally, our approach has a 10 $\times$ efficiency advantage at the unit level, taking into account both training and inference times.
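The two design aspects named in the abstract, input-dependent mixing of a "CI predictor" and a "CD predictor" and deep supervision of each branch, can be illustrated with a minimal NumPy sketch. This is not the Mlinear architecture: the abstract does not specify the layers, the gating mechanism, or the new loss function, so every name here (`ci_predict`, `cd_predict`, `mix_forward`, `deeply_supervised_loss`, `aux_weight`), the sigmoid gate, and the use of plain MSE as a stand-in loss are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def ci_predict(x, w_time):
    # Hypothetical CI branch: one temporal linear map shared by all channels,
    # applied to each channel separately. x: (batch, seq_len, channels).
    return np.einsum("blc,lh->bhc", x, w_time)

def cd_predict(x, w_time, w_chan):
    # Hypothetical CD branch: mix information across channels first,
    # then project along the time axis.
    mixed = x @ w_chan  # (batch, seq_len, channels)
    return np.einsum("blc,lh->bhc", mixed, w_time)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mix_forward(x, params):
    y_ci = ci_predict(x, params["w_ci"])
    y_cd = cd_predict(x, params["w_cd_time"], params["w_cd_chan"])
    # Assumed gate: a per-sample, per-channel weight in (0, 1) computed
    # from the input window, so the CI/CD balance depends on the input.
    gate = sigmoid(np.einsum("blc,l->bc", x, params["w_gate"]))[:, None, :]
    y_mix = gate * y_ci + (1.0 - gate) * y_cd
    return y_mix, y_ci, y_cd

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def deeply_supervised_loss(outputs, target, aux_weight=0.5):
    # Deep supervision: besides the mixed forecast, each branch is also
    # penalized on its own. MSE is a placeholder, not the paper's loss.
    y_mix, y_ci, y_cd = outputs
    return mse(y_mix, target) + aux_weight * (mse(y_ci, target) + mse(y_cd, target))

# Toy shapes: input length 336 as in the abstract; horizon and channel
# count are arbitrary choices for the demo.
batch, seq_len, channels, horizon = 2, 336, 7, 96
params = {
    "w_ci": rng.normal(scale=0.01, size=(seq_len, horizon)),
    "w_cd_time": rng.normal(scale=0.01, size=(seq_len, horizon)),
    "w_cd_chan": rng.normal(scale=0.1, size=(channels, channels)),
    "w_gate": rng.normal(scale=0.01, size=(seq_len,)),
}
x = rng.normal(size=(batch, seq_len, channels))
target = rng.normal(size=(batch, horizon, channels))
outputs = mix_forward(x, params)
loss = deeply_supervised_loss(outputs, target)
```

In this sketch the mixed forecast is a convex combination of the two branch forecasts, and setting `aux_weight=0` reduces the objective to supervising the mixed output alone. Training the parameters (for example with an autodiff framework) is left out.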
