Privacy-Preserving Synthetic Dataset of Individual Daily Trajectories for City-Scale Mobility Analytics

Urban mobility data are indispensable for urban planning, transportation demand forecasting, pandemic modeling, and many other applications; however, individual Global Positioning System (GPS) traces derived from mobile phones generally cannot be shared with third parties owing to severe re-identification risks. Aggregated records, such as origin-destination (OD) matrices, offer partial insights but fail to capture key behavioral properties of daily human movement, which limits realistic city-scale analyses. This study presents a privacy-preserving synthetic mobility dataset that reconstructs individual daily trajectories from aggregated inputs. The proposed method integrates OD flows with two complementary behavioral constraints: (1) dwell-travel time quantiles, which are available only as coarse summary statistics, and (2) a universal law governing the distribution of the number of locations visited per day. Embedding these elements in a multi-objective optimization framework reproduces realistic distributions of human mobility without requiring any personal identifiers. The framework is validated in two contrasting regions of Japan: (1) the 23 special wards of Tokyo, representing a dense metropolitan environment, and (2) Fukuoka Prefecture, where urban and suburban mobility patterns coexist. The resulting synthetic mobility data reproduce the dwell-travel time and visit frequency distributions with high fidelity, while deviations in OD consistency remain within the natural range of daily fluctuations. These results establish a practical synthesis pathway under real-world constraints. The pathway gives governments, urban planners, and industries scalable access to high-resolution mobility data for reliable analytics without sensitive personal records, and it supports practical deployments in policy and commercial domains.
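To make the combination of the three aggregated constraints concrete, the following is a minimal sketch of how candidate synthetic trajectories could be scored against OD flows, dwell-time quartiles, and a target visit-count distribution. It is an illustration under stated assumptions, not the method used in this study: the data layout, all function names, the choice of quartiles, the use of dwell times only (travel times could be scored the same way), and the weighted-sum scalarization of the objectives are hypothetical.

```python
from collections import Counter
from statistics import quantiles

# Assumed layout (hypothetical): a stay is (zone_id, arrival_min, departure_min),
# and a trajectory is one synthetic person's ordered list of stays for one day.

def od_counts(trajectories):
    """Count trips between consecutive stays as (origin, destination) pairs."""
    counts = Counter()
    for stays in trajectories:
        for (origin, _, _), (dest, _, _) in zip(stays, stays[1:]):
            counts[(origin, dest)] += 1
    return counts

def od_error(trajectories, target_od):
    """Sum of absolute OD count differences, normalized by total target flow."""
    synth = od_counts(trajectories)
    pairs = set(synth) | set(target_od)
    total = sum(target_od.values())
    return sum(abs(synth.get(p, 0) - target_od.get(p, 0)) for p in pairs) / total

def dwell_quartile_error(trajectories, target_quartiles):
    """Mean absolute gap (minutes) between synthetic and target dwell quartiles."""
    dwell = [dep - arr for stays in trajectories for (_, arr, dep) in stays]
    synth_q = quantiles(dwell, n=4)  # three cut points: Q1, median, Q3
    return sum(abs(s - t) for s, t in zip(synth_q, target_quartiles)) / 3

def visit_count_error(trajectories, target_pmf):
    """Total variation distance between synthetic and target distributions
    of the number of distinct zones visited per day."""
    counts = Counter(len({zone for zone, _, _ in stays}) for stays in trajectories)
    n = len(trajectories)
    keys = set(counts) | set(target_pmf)
    return 0.5 * sum(abs(counts.get(k, 0) / n - target_pmf.get(k, 0.0)) for k in keys)

def combined_loss(trajectories, target_od, target_quartiles, target_pmf,
                  weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three error terms (an assumed scalarization)."""
    w_od, w_dwell, w_visit = weights
    return (w_od * od_error(trajectories, target_od)
            + w_dwell * dwell_quartile_error(trajectories, target_quartiles)
            + w_visit * visit_count_error(trajectories, target_pmf))
```

A search procedure of any kind could then propose changes to the trajectories and keep those that lower `combined_loss`. Note that only aggregated targets enter the score, so no individual records are needed as input.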
