PRIVIC: A privacy-preserving method for incremental collection of location data

With recent advancements in technology, threats to the privacy of individuals' sensitive data are surging. Location data, in particular, have been shown to carry a substantial amount of sensitive information. A standard method to mitigate the privacy risks for location data consists in adding noise to the true values to achieve geo-indistinguishability (geo-ind). However, geo-ind alone is not sufficient to cover all privacy concerns. In particular, isolated locations are not sufficiently protected by the state-of-the-art Laplace mechanism (LAP) for geo-ind. In this paper, we focus on a mechanism based on the Blahut-Arimoto algorithm (BA) from rate-distortion theory. We show that BA, in addition to providing geo-ind, enforces an elastic metric that mitigates the problem of isolation. Furthermore, BA provides an optimal trade-off between information leakage and quality of service. We then study the utility of BA in terms of the statistics that can be derived from the reported data, focusing on the inference of the original distribution. To this end, we de-noise the reported data by applying the iterative Bayesian update (IBU), an instance of the expectation-maximization method. It turns out that BA and IBU are dual to each other and, as a result, work well together: the statistical utility of BA is good overall and, at high privacy levels, better than that of LAP. Exploiting these properties of BA and IBU, we propose an iterative method, PRIVIC, for the privacy-friendly incremental collection of location data from users by service providers. We illustrate the soundness and functionality of our method both analytically and with experiments.
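To make the two building blocks named above concrete, the following is a minimal sketch of the textbook Blahut-Arimoto iteration for rate-distortion and of the iterative Bayesian update, not the paper's implementation. Everything specific in it is an assumption chosen for illustration: the ten locations on a line, the absolute-difference distance, the value of `beta`, the uniform starting prior, the synthetic `true_dist`, and the function names `blahut_arimoto` and `iterative_bayesian_update`.

```python
import numpy as np

def blahut_arimoto(prior, dist, beta, iters=200):
    """Textbook BA iteration: returns a channel C[x, y] = P(report y | true x)."""
    kernel = np.exp(-beta * dist)                     # exp(-beta * d(x, y))
    q = np.full(dist.shape[1], 1.0 / dist.shape[1])   # output marginal, starts uniform
    for _ in range(iters):
        channel = kernel * q                          # q[y] * exp(-beta * d(x, y))
        channel /= channel.sum(axis=1, keepdims=True) # normalise each row
        q = prior @ channel                           # re-estimate the output marginal
    return channel

def iterative_bayesian_update(channel, observed, iters=500):
    """IBU (an EM instance): estimates the input distribution from noisy reports."""
    theta = np.full(channel.shape[0], 1.0 / channel.shape[0])  # uniform start
    for _ in range(iters):
        joint = theta[:, None] * channel              # theta[x] * C[x, y]
        evidence = np.maximum(joint.sum(axis=0), 1e-300)  # guard against 0/0
        theta = (joint / evidence) @ observed         # average posterior over reports
    return theta

# Hypothetical setup: 10 locations on a line, distance = absolute difference.
rng = np.random.default_rng(0)
n = 10
points = np.arange(n, dtype=float)
dist = np.abs(points[:, None] - points[None, :])
beta = 1.0                                            # assumed privacy/utility knob
prior = np.full(n, 1.0 / n)                           # assumed initial guess
channel = blahut_arimoto(prior, dist, beta)

# Synthetic "true" distribution, used only to generate illustrative reports.
true_dist = np.array([0.30, 0.20, 0.15, 0.10, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03])
true_counts = rng.multinomial(20000, true_dist)
report_counts = np.zeros(n)
for x, count in enumerate(true_counts):
    report_counts += rng.multinomial(count, channel[x])  # obfuscate each user's location
observed = report_counts / report_counts.sum()

estimate = iterative_bayesian_update(channel, observed)
print(np.round(estimate, 3))
```

Because each row of the channel has the form q(y) exp(-beta d(x, y)) divided by a normalising constant, the triangle inequality gives C[x, y] <= exp(2 beta d(x, x')) C[x', y] whenever d is a metric, which is the geo-ind guarantee in this sketch. One plausible reading of the incremental collection described above is to pass `estimate` back in as `prior` for the next round of `blahut_arimoto`; that loop is not shown here.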
