Probabilistic counters are well-known tools for space-efficient estimation of set cardinality. In this paper, we investigate probabilistic counters from the perspective of privacy preservation, using the standard, strict notion of differential privacy. The intuition is that probabilistic counters reveal little about any individual and provide only general information about the population, so they can be used safely without violating individual privacy. However, it turns out that a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and requires advanced techniques and a very careful approach. We demonstrate that probabilistic counters can serve as a privacy protection mechanism without extra randomization: the randomness inherent in the protocol is sufficient to protect privacy, even if the probabilistic counter is used multiple times. In particular, we present a specific privacy-preserving data aggregation protocol based on the Morris Counter and the MaxGeo Counter. Some of the presented results concern counters that have not previously been investigated from the perspective of privacy protection; others improve on previous results. We show how our results can be used to perform distributed surveys, and we compare the properties of counter-based solutions with those of the standard Laplace method.
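For readers unfamiliar with the two counters named in the abstract, the following is a minimal Python sketch of their commonly described update rules. It is an illustration under stated assumptions, not the protocol from the paper: the class names, the initial state of 0, and the `2^c - 1` estimator are assumptions taken from the classic textbook formulation, and the paper's exact definitions may differ. The point it illustrates is that each increment is itself a random event, so the stored state is already a noisy function of the true count.

```python
import random


class MorrisCounter:
    """Classic Morris counter: stores roughly log2(n) instead of n.

    Assumption: the state starts at 0; some presentations start at 1
    and adjust the estimator accordingly.
    """

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.c = 0

    def increment(self):
        # Advance the state with probability 2^(-c).
        if self.rng.random() < 2.0 ** (-self.c):
            self.c += 1

    def estimate(self):
        # With the state starting at 0, E[2^c] = n + 1 after n increments,
        # so 2^c - 1 is an unbiased estimate of n.
        return 2 ** self.c - 1


class MaxGeoCounter:
    """Keeps the maximum of independent Geo(1/2) draws, one per increment.

    Assumption: Geo(1/2) counts fair coin flips up to and including the
    first success, so every draw is at least 1.
    """

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.c = 0

    def increment(self):
        g = 1
        while self.rng.random() < 0.5:
            g += 1
        # The state only changes when a new draw exceeds the current maximum.
        self.c = max(self.c, g)


if __name__ == "__main__":
    morris, maxgeo = MorrisCounter(), MaxGeoCounter()
    for _ in range(10_000):
        morris.increment()
        maxgeo.increment()
    print("Morris state:", morris.c, "estimate:", morris.estimate())
    print("MaxGeo state:", maxgeo.c)
```

Both states grow roughly logarithmically in the number of increments, which is where the space efficiency comes from. Whether, and with which parameters, this built-in randomness yields differential privacy is the question the paper analyzes; the sketch makes no such guarantee.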