Time series forecasting is a critical and challenging task in practical applications. Pre-trained foundation models for time series forecasting have recently attracted significant interest. However, current methods often overlook the multi-scale nature of time series, which is essential for accurate forecasting. To address this, we propose HiMTM, a hierarchical multi-scale masked time series modeling method with self-distillation for long-term forecasting. HiMTM integrates four key components: (1) a hierarchical multi-scale transformer (HMT) that captures temporal information at different scales; (2) a decoupled encoder-decoder (DED) that directs the encoder towards feature extraction while the decoder focuses on pretext tasks; (3) hierarchical self-distillation (HSD), which provides multi-stage feature-level supervision signals during pre-training; and (4) cross-scale attention fine-tuning (CSA-FT), which captures dependencies between different scales for downstream tasks. Together, these components enhance multi-scale feature extraction in masked time series modeling and improve forecasting accuracy. Extensive experiments on seven mainstream datasets show that HiMTM surpasses state-of-the-art self-supervised and end-to-end learning methods by a margin of 3.16-68.54\%. In cross-domain forecasting, HiMTM also outperforms PatchTST, a recent and robust self-supervised learning method, by 2.3\%. The effectiveness of HiMTM is further demonstrated through its application to natural gas demand forecasting.
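The idea of masked modeling at several temporal scales can be illustrated with a minimal sketch. This is not the HiMTM implementation: the abstract does not specify how the series is tokenized or masked, so the non-overlapping patching, the patch lengths (4, 8, 16), the 50\% mask ratio, and the function names `make_patches`, `multi_scale_patches`, and `random_mask` are all assumptions made for illustration only.

```python
import random


def make_patches(series, patch_len):
    """Split a 1-D series into non-overlapping patches, dropping any remainder."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def multi_scale_patches(series, patch_lens=(4, 8, 16)):
    """Patch the same series at several scales (patch lengths are hypothetical)."""
    return {p: make_patches(series, p) for p in patch_lens}


def random_mask(num_patches, mask_ratio, rng):
    """Return the sorted indices of patches to hide; the rest stay visible."""
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))


# Toy series with a daily-like period of 24 steps (synthetic, for illustration).
series = [float(t % 24) for t in range(96)]

# One set of patches per scale: coarser scales yield fewer, longer patches.
scales = multi_scale_patches(series)

# Assumed mask ratio of 0.5; a pretext task would reconstruct the hidden patches.
rng = random.Random(0)
masked = {p: random_mask(len(patches), 0.5, rng) for p, patches in scales.items()}
```

In a masked-modeling setup of this kind, an encoder would see only the visible patches at each scale, and a decoder would be trained to recover the hidden ones; how HiMTM connects the scales is described by its HMT, HSD, and CSA-FT components rather than by this sketch.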