Deep learning-based traffic prediction models require vast amounts of data to learn embedded spatial and temporal dependencies. The inherent privacy and commercial sensitivity of such data have encouraged a shift towards decentralised data-driven methods such as Federated Learning (FL). Under a traditional Machine Learning paradigm, traffic flow prediction models capture spatial and temporal relationships within centralised data. In reality, traffic data is likely to be distributed across separate data silos owned by multiple stakeholders. This work therefore motivates a cross-silo FL setting that facilitates stakeholder collaboration on traffic flow prediction applications. It introduces an FL framework, referred to as FedTPS, which trains a diffusion-based trajectory generation model through FL and uses it to generate synthetic data that augments each client's local dataset. The proposed framework is evaluated on a large-scale real-world ride-sharing dataset using various FL methods and traffic flow prediction models, including a novel prediction model we introduce that leverages temporal and graph attention mechanisms to learn the spatio-temporal dependencies embedded within regional traffic flow data. Experimental results show that FedTPS outperforms multiple other FL baselines with respect to global model performance.