Deep learning-based traffic prediction models require vast amounts of data to learn embedded spatial and temporal dependencies. The inherent privacy and commercial sensitivity of such data have encouraged a shift towards decentralised data-driven methods, such as Federated Learning (FL). Under a traditional Machine Learning paradigm, traffic flow prediction models can capture spatial and temporal relationships within centralised data. In reality, traffic data is likely distributed across separate data silos owned by multiple stakeholders. This work motivates a cross-silo FL setting to facilitate stakeholder collaboration for optimal traffic flow prediction applications. It introduces an FL framework, referred to as FedTPS, which trains a diffusion-based trajectory generation model through FL and uses it to generate synthetic data that augments each client's local dataset. The proposed framework is evaluated on a large-scale real-world ride-sharing dataset using various FL methods and traffic flow prediction models, including a novel prediction model we introduce, which leverages temporal and graph attention mechanisms to learn the spatio-temporal dependencies embedded within regional traffic flow data. Experimental results show that FedTPS outperforms multiple other FL baselines with respect to global model performance.