Probabilistic counters are well-known tools often used for space-efficient set cardinality estimation. In this paper, we investigate probabilistic counters from the perspective of privacy preservation. We use the standard, strict notion of differential privacy. The intuition is that probabilistic counters reveal little about individuals and provide only general information about the population, so they can be used safely without violating individual privacy. However, it turns out that a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and requires advanced techniques and a very careful approach. We demonstrate that probabilistic counters can be used as a privacy protection mechanism without extra randomization: the inherent randomization of the protocol is sufficient to protect privacy, even if the probabilistic counter is used multiple times. In particular, we present a specific privacy-preserving data aggregation protocol based on the Morris Counter and the MaxGeo Counter. Some of the presented results concern counters that have not previously been investigated from the perspective of privacy protection; others improve on previous results. We show how our results can be used to perform distributed surveys, and we compare the properties of counter-based solutions with those of the standard Laplace method.
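For readers unfamiliar with the two counters named in the abstract, the following is a minimal sketch of how they are commonly defined in the literature, not the protocol analyzed in this paper. It assumes the textbook base-2 Morris Counter (state starts at 0 and is incremented with probability 2^(-c)) and a MaxGeo Counter that keeps the maximum of independent Geometric(1/2) draws; the class names, the starting values, and the `rng` parameter are illustrative assumptions.

```python
import random


class MorrisCounter:
    """Textbook base-2 Morris Counter (assumed variant, state starts at 0)."""

    def __init__(self, rng=None):
        self.c = 0
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Advance the state with probability 2^(-c); otherwise do nothing.
        if self.rng.random() < 2.0 ** (-self.c):
            self.c += 1

    def estimate(self):
        # Under this variant, 2^c - 1 is an unbiased estimate of the count.
        return 2 ** self.c - 1


class MaxGeoCounter:
    """Keeps the maximum of i.i.d. Geometric(1/2) draws (assumed variant)."""

    def __init__(self, rng=None):
        self.m = 0
        self.rng = rng if rng is not None else random.Random()

    def _geometric(self):
        # Number of fair coin flips up to and including the first "success".
        k = 1
        while self.rng.random() < 0.5:
            k += 1
        return k

    def increment(self):
        # Each counted element contributes one draw; only the maximum is kept.
        self.m = max(self.m, self._geometric())


if __name__ == "__main__":
    morris, maxgeo = MorrisCounter(), MaxGeoCounter()
    for _ in range(1000):
        morris.increment()
        maxgeo.increment()
    print("Morris state:", morris.c, "estimate:", morris.estimate())
    print("MaxGeo state:", maxgeo.m)
```

In both sketches the stored state is a random function of the true count, which is the inherent randomization the abstract refers to; no additional noise is added here.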