Large pre-trained models excel at zero/few-shot learning for language and vision tasks, but they face challenges in multivariate time series (TS) forecasting because of the diverse characteristics of TS data. Consequently, recent research has focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), perform well in zero/few-shot forecasting tasks. However, they are limited by slow inference, high computational demands, and neglect of cross-channel and exogenous correlations. To address these limitations, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the lightweight TSMixer architecture, incorporates innovations such as adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on datasets of varied resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and to infuse exogenous signals during fine-tuning. TTM outperforms popular existing benchmark models in zero/few-shot forecasting by 4-40%, while significantly reducing computational requirements. Moreover, TTMs are lightweight and can run even on CPU-only machines, which improves usability and fosters wider adoption in resource-constrained environments. Model weights for our initial variant (TTM-Q) are available at https://huggingface.co/ibm-granite/granite-timeseries-ttm-v1. Model weights for more sophisticated variants (TTM-B, TTM-E, and TTM-A) will be shared soon. The source code for TTM can be accessed at https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer.
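The abstract does not spell out how patching or resolution sampling are implemented. As a rough illustration of the two general ideas only, the following Python sketch splits a context window into fixed-length patches and derives a coarser-resolution series from a finer one. It is not TTM's code: the function names `make_patches` and `downsample_mean` are hypothetical, and averaging consecutive points is an assumed way to lower the resolution.

```python
from statistics import fmean


def make_patches(series, patch_length, stride):
    """Split a 1-D series into windows of `patch_length`, one every `stride` steps.

    Hypothetical helper for illustration; not part of the TTM code base.
    """
    if patch_length <= 0 or stride <= 0:
        raise ValueError("patch_length and stride must be positive")
    last_start = len(series) - patch_length
    # An empty list is returned when the series is shorter than one patch.
    return [series[i:i + patch_length] for i in range(0, last_start + 1, stride)]


def downsample_mean(series, factor):
    """Build a coarser series by averaging each block of `factor` points.

    Averaging is an assumption made here; trailing points that do not fill
    a complete block are dropped.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    usable = len(series) - len(series) % factor
    return [fmean(series[i:i + factor]) for i in range(0, usable, factor)]


# Toy context window of 16 time steps.
context = [float(t) for t in range(16)]

# Non-overlapping patches of length 4.
patches = make_patches(context, patch_length=4, stride=4)

# A 4x coarser view of the same series.
coarse = downsample_mean(context, factor=4)
```

With a stride equal to the patch length the patches do not overlap; a smaller stride would produce overlapping patches from the same window.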