Temporal data, notably time series and spatio-temporal data, are prevalent in real-world applications. They capture dynamic system measurements and are produced in vast quantities by both physical and virtual sensors. Analyzing these data types is vital to harnessing the rich information they contain and thus benefits a wide range of downstream tasks. Recent advances in large language models and other foundation models have spurred increased use of these models in time series and spatio-temporal data mining. Such methodologies not only enable enhanced pattern recognition and reasoning across diverse domains but also lay the groundwork for artificial general intelligence capable of comprehending and processing common temporal data. In this survey, we offer a comprehensive and up-to-date review of large models tailored (or adapted) for time series and spatio-temporal data, spanning four key facets: data types, model categories, model scopes, and application areas/tasks. Our objective is to equip practitioners with the knowledge to develop applications and further research in this underexplored domain. We primarily categorize the existing literature into two major clusters: large models for time series analysis (LM4TS) and large models for spatio-temporal data mining (LM4STD). On this basis, we further classify research by model scope (i.e., general vs. domain-specific) and by application area/task. We also provide a comprehensive collection of pertinent resources, including datasets, model assets, and useful tools, categorized by mainstream applications. This survey consolidates the latest progress in large model-centric research on time series and spatio-temporal data, underscoring the solid foundations, current advances, practical applications, abundant resources, and future research opportunities.