Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow inference, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the lightweight TSMixer architecture, incorporates innovations such as adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on datasets of varied resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by 4–40% while significantly reducing computational requirements. Moreover, TTMs are lightweight and can run even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments. The model weights for reproducibility and research use are available at https://huggingface.co/ibm/ttm-research-r2/, while enterprise-use weights under the Apache license can be accessed as follows: the initial TTM-Q variant at https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1, and the latest variants (TTM-B, TTM-E, TTM-A) at https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2.