Trip data, which record each vehicle's trips on the road network, describe the operation of urban traffic from the individual perspective and are extremely valuable for transportation research. However, because of data privacy restrictions, individual-level trip data cannot be released to all researchers, even though the need for such data is urgent. In this paper, we produce a city-scale synthetic individual-level vehicle trip dataset by generating trips for each individual based on historical trip data, balancing data availability against trip data privacy protection. Privacy protection inevitably affects the availability of data. We therefore conduct numerous experiments to demonstrate the performance and reliability of the synthetic data in different dimensions and at different granularities, to help users judge which tasks it can support. The results show that the synthetic data are consistent with the real data (i.e., the historical data) at the aggregated level and reasonable from the individual perspective.