Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow inference, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the lightweight TSMixer architecture, incorporates innovations such as adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on datasets of varied resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by 4–40% while significantly reducing computational requirements. Moreover, TTMs are lightweight and can run even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments. The model weights for reproducibility and research use are available at https://huggingface.co/ibm/ttm-research-r2/, while enterprise-use weights under the Apache license can be accessed as follows: the initial TTM-Q variant at https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1, and the latest variants (TTM-B, TTM-E, TTM-A) at https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2.