We introduce MOMENT, a family of open-source foundation models for general-purpose time-series analysis. Pre-training large models on time-series data is challenging due to (1) the absence of a large and cohesive public time-series repository, and (2) diverse time-series characteristics, which make multi-dataset training onerous. Additionally, (3) experimental benchmarks for evaluating these models, especially in scenarios with limited resources, time, and supervision, are still in their nascent stages. To address these challenges, we compile a large and diverse collection of public time-series, called the Time-series Pile, and systematically tackle time-series-specific challenges to unlock large-scale multi-dataset pre-training. Building on recent work, we also design a benchmark to evaluate time-series foundation models on diverse tasks and datasets in limited-supervision settings. Experiments on this benchmark demonstrate the effectiveness of our pre-trained models with minimal data and task-specific fine-tuning. Finally, we present several interesting empirical observations about large pre-trained time-series models. Our code is available anonymously at anonymous.4open.science/r/BETT-773F/.