Location data is collected from users continuously to understand their mobility patterns. Releasing user trajectories may compromise user privacy, so the general practice is to release aggregated location datasets. However, private information may still be inferred from an aggregated version of location trajectories. Differential privacy (DP) protects query outputs against inference attacks regardless of the attacker's background knowledge. This paper presents a DP-based privacy model that protects users' origins and destinations from being inferred from aggregated mobility datasets. This is achieved by injecting Planar Laplace noise into each user's origin and destination GPS points. The noisy GPS points are then transformed into a link representation using a link-matching algorithm, and the resulting link trajectories form an aggregated mobility network. The injected noise level is selected using the Sparse Vector Mechanism, a DP selection mechanism that here considers the link density of the location and the functional category of the localized links. Compared with several baseline models, including a k-anonymity method, our DP-based aggregation model offers query responses that are close to the raw data in terms of aggregate statistics at both the network and trajectory levels, with a maximum deviation of 9% from the baseline in terms of network length.
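As a rough illustration of the Planar Laplace step mentioned in the abstract, the sketch below perturbs a single point given in projected planar coordinates (meters). It is not the paper's implementation: the function names `planar_laplace_offset` and `perturb_point`, the use of projected coordinates instead of latitude and longitude, and the fixed `epsilon` are all assumptions made for illustration. In the paper the noise level is selected by the Sparse Vector Mechanism, which is not reproduced here. The sketch relies on the fact that a planar Laplace density proportional to exp(-epsilon * distance) has a uniform angle and a Gamma(shape=2, scale=1/epsilon) radius.

```python
import math
import random


def planar_laplace_offset(epsilon, rng=random):
    """Draw a 2-D offset (dx, dy) from a planar Laplace distribution.

    The density is proportional to exp(-epsilon * distance), so the angle
    is uniform on [0, 2*pi) and the radius follows a Gamma distribution
    with shape 2 and scale 1/epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    theta = rng.uniform(0.0, 2.0 * math.pi)        # direction of the noise
    radius = rng.gammavariate(2.0, 1.0 / epsilon)  # distance from true point
    return radius * math.cos(theta), radius * math.sin(theta)


def perturb_point(x, y, epsilon, rng=random):
    """Return a noisy copy of the planar point (x, y), in the same units."""
    dx, dy = planar_laplace_offset(epsilon, rng)
    return x + dx, y + dy


if __name__ == "__main__":
    # Hypothetical origin in projected meters; epsilon is per meter (assumed).
    noisy_origin = perturb_point(1000.0, 2000.0, epsilon=0.01)
    print(noisy_origin)
```

With `epsilon` expressed per meter, the expected displacement is `2 / epsilon`, so `epsilon=0.01` moves a point by about 200 m on average. A smaller `epsilon` gives stronger protection of origins and destinations at the cost of larger distortion before link matching.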