Large-scale missing data is a challenging problem in Intelligent Transportation Systems (ITS). Many studies impute large-scale traffic data by exploiting its spatiotemporal correlations at the network level. Existing imputation methods, however, largely ignore the rich semantic information of the road network when capturing these network-wide correlations. This study proposes a Graph Transformer for Traffic Data Imputation (GT-TDI) model that imputes large-scale traffic data with a spatiotemporal semantic understanding of the road network. Specifically, the model introduces semantic descriptions that encode network-wide spatial and temporal information about the traffic data, which help it capture spatiotemporal correlations at the network level. GT-TDI takes incomplete data, the social connectivity of sensors, and semantic descriptions as input, and performs imputation using Graph Neural Networks (GNN) and a Transformer. Extensive experiments on the PeMS freeway dataset compare GT-TDI with conventional methods, tensor factorization methods, and deep learning-based methods. The results show that GT-TDI outperforms existing methods under complex missing patterns and diverse missing rates. The code of the GT-TDI model will be available at https://github.com/KP-Zhang/GT-TDI.
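To make the imputation setting concrete, the following minimal sketch shows how incomplete data and sensor connectivity can serve as inputs, and how imputation error can be scored on the hidden entries only. It is not the GT-TDI model: the neighbor-mean rule is a simple stand-in baseline, and the function names (`mask_random`, `neighbor_mean_impute`, `masked_mae`), the random missing pattern, and the binary adjacency matrix are all assumptions made for illustration.

```python
import random


def mask_random(data, rate, seed=0):
    """Hide entries independently at random (assumed missing pattern).

    Returns (observed, mask); mask[i][t] is True where the value is kept,
    and hidden entries in `observed` are set to None.
    """
    rng = random.Random(seed)
    mask = [[rng.random() >= rate for _ in row] for row in data]
    observed = [[v if m else None for v, m in zip(row, mrow)]
                for row, mrow in zip(data, mask)]
    return observed, mask


def neighbor_mean_impute(observed, adjacency):
    """Stand-in baseline, not GT-TDI: fill each missing entry with the mean
    of connected sensors' observed values at the same time step.

    Falls back to the sensor's own observed mean, then to 0.0.
    """
    n = len(observed)
    steps = len(observed[0])
    filled = [row[:] for row in observed]
    for i in range(n):
        own = [v for v in observed[i] if v is not None]
        fallback = sum(own) / len(own) if own else 0.0
        for t in range(steps):
            if observed[i][t] is None:
                vals = [observed[j][t] for j in range(n)
                        if j != i and adjacency[i][j]
                        and observed[j][t] is not None]
                filled[i][t] = sum(vals) / len(vals) if vals else fallback
    return filled


def masked_mae(truth, filled, mask):
    """Mean absolute error over the hidden entries only (mask is False)."""
    errs = [abs(a - b)
            for trow, frow, mrow in zip(truth, filled, mask)
            for a, b, m in zip(trow, frow, mrow) if not m]
    return sum(errs) / len(errs) if errs else 0.0
```

In this sketch each row of the data is one sensor and each column one time step. Scoring only the hidden entries lets different missing rates be compared on equal terms; more complex missing patterns would need a different masking function.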