FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting

Time Series Forecasting (TSF) is a key functionality in numerous fields, including finance, weather services, and energy management. Although new TSF methods continue to emerge, many of them require domain-specific data collection and model training, and they generalize poorly to new domains. Foundation models aim to overcome this limitation. Pre-trained on large-scale language or time series data, they exhibit promising inference capabilities on new or unseen data. This has spurred a surge in new TSF foundation models. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models. First, FoundTS covers a variety of TSF foundation models, including those based on large language models and those pre-trained on time series. Next, FoundTS supports different forecasting strategies, including zero-shot, few-shot, and full-shot, thereby facilitating more thorough evaluations. Finally, FoundTS offers a pipeline that standardizes evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, thereby facilitating fair evaluations. Building on this, we report on an extensive evaluation of TSF foundation models on a broad range of datasets from diverse domains and with different statistical characteristics. Specifically, we identify the pros and cons and the inherent limitations of existing foundation models, and we identify directions for future model design. We make our code and datasets available at https://anonymous.4open.science/r/FoundTS-C2B0.
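
The standardized steps named in the abstract (dataset splitting, normalization, and few-shot sampling) can be illustrated with a minimal sketch. This is not FoundTS code: the function names, the 70/10/20 split, the 10% few-shot fraction, and the choice to sample the most recent part of the training data are all assumptions made for illustration only.

```python
from statistics import mean, pstdev

def chronological_split(series, train_frac=0.7, val_frac=0.1):
    """Split a series into train/val/test in time order (no shuffling).
    The 70/10/20 ratio is an assumed default, not taken from the paper."""
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]

def fit_scaler(train):
    """Compute z-score statistics on the training split only,
    so that no information from val/test leaks into normalization."""
    mu = mean(train)
    sigma = pstdev(train) or 1.0  # guard against a constant series
    return mu, sigma

def normalize(series, mu, sigma):
    return [(x - mu) / sigma for x in series]

def few_shot_subset(train, fraction=0.1):
    """Keep a small contiguous slice of the training data.
    Taking the most recent slice is an assumption of this sketch."""
    k = max(1, round(len(train) * fraction))
    return train[-k:]

def prepare(series, strategy="full-shot", few_shot_fraction=0.1):
    """Return normalized fit/val/test data for one forecasting strategy."""
    train, val, test = chronological_split(series)
    mu, sigma = fit_scaler(train)
    if strategy == "zero-shot":
        fit_data = []                      # model sees no target-domain training data
    elif strategy == "few-shot":
        fit_data = few_shot_subset(train, few_shot_fraction)
    elif strategy == "full-shot":
        fit_data = train
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {
        "fit": normalize(fit_data, mu, sigma),
        "val": normalize(val, mu, sigma),
        "test": normalize(test, mu, sigma),
    }

if __name__ == "__main__":
    toy_series = [float(i) for i in range(100)]  # hypothetical data
    for strategy in ("zero-shot", "few-shot", "full-shot"):
        out = prepare(toy_series, strategy)
        print(strategy, len(out["fit"]), len(out["val"]), len(out["test"]))
```

Holding the split and the normalization statistics fixed across strategies means that only the amount of fitting data changes between zero-shot, few-shot, and full-shot runs, which is the kind of like-for-like comparison a standardized pipeline is meant to support.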

