Time Series Forecasting (TSF) is a key functionality in numerous fields, including finance, weather services, and energy management. Although many TSF methods have been proposed, many of them require domain-specific data collection and model training and generalize poorly to new domains. Foundation models aim to overcome this limitation. Pre-trained on large-scale language or time series data, they exhibit promising inference capabilities on new or unseen data. This has spurred a surge in new TSF foundation models. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models. First, FoundTS covers a variety of TSF foundation models, including those based on large language models and those pre-trained on time series. Second, FoundTS supports different forecasting strategies, including zero-shot, few-shot, and full-shot, thereby facilitating more thorough evaluations. Finally, FoundTS offers a pipeline that standardizes evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, thereby facilitating fair evaluations. Building on this, we report on an extensive evaluation of TSF foundation models on a broad range of datasets from diverse domains and with different statistical characteristics. Specifically, we identify the strengths, weaknesses, and inherent limitations of existing foundation models, and we identify directions for future model design. We make our code and datasets available at https://anonymous.4open.science/r/FoundTS-C2B0.
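To make the standardized pipeline steps concrete (dataset splitting, normalization, and few-shot sampling), the following is a minimal Python sketch. It is not the FoundTS implementation: the function names, the 70/10/20 split ratios, the z-score normalization fitted on the training portion only, and the choice to take the earliest fraction of the training data as the few-shot subset are all assumptions made for illustration.

```python
import numpy as np


def chronological_split(series, train_ratio=0.7, val_ratio=0.1):
    """Split a (time, channels) array into train/val/test without shuffling.

    The ratios are illustrative assumptions, not values taken from FoundTS.
    """
    n = len(series)
    train_end = round(n * train_ratio)
    val_end = train_end + round(n * val_ratio)
    return series[:train_end], series[train_end:val_end], series[val_end:]


def fit_zscore(train):
    """Compute per-channel mean and std on the training portion only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Guard against constant channels to avoid division by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std


def apply_zscore(data, mean, std):
    """Normalize any split with statistics fitted on the training split."""
    return (data - mean) / std


def few_shot_subset(train, fraction=0.05):
    """Keep only a small leading fraction of the training data (assumed rule)."""
    k = max(1, round(len(train) * fraction))
    return train[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.normal(size=(1000, 3))  # synthetic stand-in for a dataset

    train, val, test = chronological_split(series)
    mean, std = fit_zscore(train)
    train_n = apply_zscore(train, mean, std)
    val_n = apply_zscore(val, mean, std)
    test_n = apply_zscore(test, mean, std)

    few_shot = few_shot_subset(train_n, fraction=0.05)
    print(train_n.shape, val_n.shape, test_n.shape, few_shot.shape)
```

Fitting the normalization statistics on the training split alone keeps information from the validation and test periods out of the model inputs. Applying the same split, normalization, and sampling rules to every model is what makes the comparison fair in the sense described above.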