Collecting and analyzing evolving longitudinal data has become common practice. One approach to protecting users' privacy in this setting is to use local differential privacy (LDP) protocols, which protect every user even in the event of a data breach or misuse. Existing LDP data collection protocols, such as Google's RAPPOR and Microsoft's dBitFlipPM, can incur a longitudinal privacy loss that is linear in the domain size $k$, which is excessive for large domains such as Internet domains. To address this issue, we introduce a new LDP data collection protocol for longitudinal frequency monitoring, named LOngitudinal LOcal HAshing (LOLOHA), with formal privacy guarantees. In addition, the privacy-utility trade-off of our protocol is linear only in a reduced domain size $2 \leq g \ll k$. LOLOHA combines domain reduction via local hashing with double randomization to minimize the privacy leakage incurred by data updates. As our theoretical analysis and experimental evaluation demonstrate, LOLOHA achieves utility competitive with current state-of-the-art protocols while substantially reducing the longitudinal privacy budget consumption, by up to $k/g$ orders of magnitude.
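The combination of local hashing and double randomization described above can be illustrated with a minimal client-side sketch. It is not the paper's reference implementation, and it rests on several assumptions that the abstract does not state: both randomization rounds are taken to be generalized randomized response (GRR) over the reduced domain, a salted SHA-256 digest stands in for the per-user hash function, and the names `grr`, `LolohaStyleClient`, `eps_perm`, and `eps_inst` are hypothetical. In particular, `eps_inst` is simply the parameter handed to the second GRR round here, not a calibrated per-report budget.

```python
import hashlib
import math
import random


def grr(value, g, eps, rng):
    """Generalized randomized response over {0, ..., g-1} (assumed mechanism)."""
    p = math.exp(eps) / (math.exp(eps) + g - 1)
    if rng.random() < p:
        return value  # keep the true value with probability p
    # Otherwise pick uniformly among the g-1 other values.
    other = rng.randrange(g - 1)
    return other if other < value else other + 1


class LolohaStyleClient:
    """Hypothetical client: hash to g buckets, memoize, then re-randomize."""

    def __init__(self, g, eps_perm, eps_inst, seed=None):
        if g < 2:
            raise ValueError("reduced domain size g must be at least 2")
        self.g = g
        self.eps_perm = eps_perm  # parameter of the memoized (first) round
        self.eps_inst = eps_inst  # parameter of the per-report (second) round
        self.rng = random.Random(seed)
        self.salt = self.rng.getrandbits(64)  # per-user hash seed (assumption)
        self.memo = {}  # hashed value -> permanently randomized value

    def hash_value(self, v):
        # Salted SHA-256 as a stand-in for the protocol's hash family.
        digest = hashlib.sha256(f"{self.salt}:{v}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.g

    def report(self, v):
        h = self.hash_value(v)
        # First round: randomize each distinct hashed value only once.
        if h not in self.memo:
            self.memo[h] = grr(h, self.g, self.eps_perm, self.rng)
        # Second round: fresh randomization on every report.
        return grr(self.memo[h], self.g, self.eps_inst, self.rng)


client = LolohaStyleClient(g=4, eps_perm=1.0, eps_inst=0.5, seed=0)
reports = [client.report(v) for v in range(1000)]
```

Under these assumptions, the memo table can never hold more than `g` entries, however many distinct values from the original domain the user reports over time. This is the intuition behind a longitudinal cost that scales with $g$ rather than $k$. A server-side frequency estimator is omitted because the abstract does not describe it.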