Temporal data, notably time series and spatio-temporal data, are prevalent in real-world applications. They capture measurements of dynamic systems and are produced in vast quantities by both physical and virtual sensors. Analyzing these data types is vital to harnessing the rich information they contain and thus benefits a wide range of downstream tasks. Recent advances in large language models and other foundation models have spurred their increased use in time series and spatio-temporal data mining. Such methodologies not only enable enhanced pattern recognition and reasoning across diverse domains but also lay the groundwork for artificial general intelligence capable of comprehending and processing common temporal data. In this survey, we offer a comprehensive and up-to-date review of large models tailored (or adapted) for time series and spatio-temporal data, spanning four key facets: data types, model categories, model scopes, and application areas/tasks. Our objective is to equip practitioners with the knowledge to develop applications and pursue further research in this underexplored domain. We primarily categorize the existing literature into two major clusters: large models for time series analysis (LM4TS) and large models for spatio-temporal data mining (LM4STD). On this basis, we further classify research by model scope (i.e., general vs. domain-specific) and by application area/task. We also provide a comprehensive collection of pertinent resources, including datasets, model assets, and useful tools, categorized by mainstream applications. This survey consolidates the latest progress in large model-centric research on time series and spatio-temporal data, underscoring the solid foundations, current advances, practical applications, abundant resources, and future research opportunities.