Spatio-temporal prediction techniques have developed rapidly in recent years in response to the growing demands of traffic management and travel planning. Advanced end-to-end models have notably improved predictive performance, but integrating and extending them remains challenging. This work addresses these challenges by introducing a spatio-temporal pre-training framework that integrates seamlessly with downstream baselines and enhances their performance. The framework is built upon two key designs: (i) We propose a spatio-temporal mask autoencoder as a pre-training model for learning spatio-temporal dependencies. The model incorporates customized parameter learners and hierarchical spatial pattern encoding networks, which are designed to capture customized spatio-temporal representations and intra- and inter-cluster region semantic relationships, both often neglected in existing approaches. (ii) We introduce an adaptive mask strategy as part of the pre-training mechanism. This strategy guides the mask autoencoder in learning robust spatio-temporal representations and facilitates the modeling of different relationships, ranging from intra-cluster to inter-cluster, in an easy-to-hard training manner. Extensive experiments on representative benchmarks demonstrate the effectiveness of the proposed method. Our model implementation is publicly available at https://github.com/HKUDS/GPT-ST.
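The easy-to-hard masking idea in (ii) can be illustrated with a small sketch. This is not the released GPT-ST implementation: the function `curriculum_mask`, its `progress` parameter, and the fixed `cluster_ids` input are hypothetical names chosen for illustration, and the sketch assumes cluster assignments are given in advance rather than learned adaptively during pre-training. Under those assumptions, early training masks scattered regions, so each masked region still has visible neighbors in its own cluster (intra-cluster), while late training masks whole clusters, so reconstruction must rely on other clusters (inter-cluster).

```python
import random


def curriculum_mask(cluster_ids, mask_ratio, progress, rng):
    """Return a list of booleans over regions (True = masked).

    Hypothetical illustration, not the authors' method.
    progress in [0, 1]: 0 masks scattered regions (easy),
    1 spends the whole budget on complete clusters (hard).
    """
    n = len(cluster_ids)
    n_mask = round(mask_ratio * n)        # total regions to mask
    n_hard = round(progress * n_mask)     # budget reserved for whole clusters
    masked = set()

    # Group region indices by their (assumed, precomputed) cluster id.
    clusters = {}
    for region, c in enumerate(cluster_ids):
        clusters.setdefault(c, []).append(region)

    # Hard part: mask entire clusters while they fit in the hard budget.
    order = list(clusters)
    rng.shuffle(order)
    for c in order:
        members = clusters[c]
        if len(masked) + len(members) <= n_hard:
            masked.update(members)

    # Easy part: fill the remaining budget with scattered random regions.
    remaining = n_mask - len(masked)
    candidates = [r for r in range(n) if r not in masked]
    masked.update(rng.sample(candidates, remaining))
    return [r in masked for r in range(n)]


rng = random.Random(0)
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]  # 12 regions, 4 clusters
easy = curriculum_mask(cluster_ids, mask_ratio=0.25, progress=0.0, rng=rng)
hard = curriculum_mask(cluster_ids, mask_ratio=0.25, progress=1.0, rng=rng)
```

In this toy setup both calls mask three of the twelve regions. With `progress=1.0` the three masked regions form one complete cluster. A training loop would raise `progress` from 0 to 1 over the pre-training epochs.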