Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series

Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow performance, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the light-weight TSMixer architecture, incorporates innovations like adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on varied dataset resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by 4-40%, while reducing computational requirements significantly. Moreover, TTMs are lightweight and can be executed even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments. Model weights for our initial variant (TTM-Q) are available at https://huggingface.co/ibm-granite/granite-timeseries-ttm-v1. Model weights for more sophisticated variants (TTM-B, TTM-E, and TTM-A) will be shared soon. The source code for TTM can be accessed at https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer.
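
The abstract mentions patching at more than one granularity but gives no implementation detail. As a rough illustration only, the sketch below shows plain patching: each channel of a multivariate series is cut into fixed-length windows, and a shorter patch length yields more, finer patches. The function names, the toy data, and the non-overlapping stride are assumptions made for this example. This is not TTM's adaptive patching and does not use the granite-tsfm API.

```python
from typing import List


def make_patches(channel: List[float], patch_len: int, stride: int) -> List[List[float]]:
    """Split one channel into windows of length patch_len, stepping by stride.

    Hypothetical helper for illustration; trailing values that do not fill
    a whole patch are dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(channel) - patch_len
    return [channel[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def patch_multivariate(series: List[List[float]], patch_len: int, stride: int) -> List[List[List[float]]]:
    """Patch every channel independently; result is indexed [channel][patch][time]."""
    return [make_patches(ch, patch_len, stride) for ch in series]


# Toy input (assumed): 2 channels, 8 time steps each.
series = [[float(t) for t in range(8)], [float(10 + t) for t in range(8)]]

# The same context window viewed at two patch granularities.
coarse = patch_multivariate(series, patch_len=4, stride=4)
fine = patch_multivariate(series, patch_len=2, stride=2)

print(len(coarse[0]), len(fine[0]))
```

Halving the patch length here doubles the number of patches per channel, which is the kind of trade-off a model would be balancing if it varied patch length across its layers.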
