Probabilistic counters are well-known tools for space-efficient estimation of set cardinality. In this paper, we investigate probabilistic counters from the perspective of privacy preservation, using the standard, strict notion of differential privacy. The intuition is that probabilistic counters do not reveal too much information about individuals and provide only general information about the population, so they can be used safely without violating individual privacy. However, it turns out that a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and requires advanced techniques and a very careful approach. We demonstrate that probabilistic counters can serve as a privacy protection mechanism without extra randomization: the randomization inherent in the protocol is sufficient to protect privacy, even if the probabilistic counter is used multiple times. In particular, we present a specific privacy-preserving data aggregation protocol based on the Morris Counter and the MaxGeo Counter. Some of the presented results concern counters that have not previously been investigated from the perspective of privacy protection; others improve on earlier results. We show how our results can be used to perform distributed surveys, and we compare the properties of counter-based solutions with those of the standard Laplace method.
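For readers unfamiliar with the two counters named in the abstract, the following minimal sketch shows the textbook update rules. It is an illustration under stated assumptions, not the paper's protocol: the class names, the initial values, the base-2 parameters, and the estimator `2**c - 1` are choices made here and may differ from the exact definitions analyzed in the paper.

```python
import random


class MorrisCounter:
    """Textbook Morris counter (assumed variant: starts at 0, base 2)."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.c = 0  # stores roughly log2 of the number of increments

    def increment(self):
        # Increase the stored value with probability 2^(-c).
        if self.rng.random() < 2.0 ** (-self.c):
            self.c += 1

    def estimate(self):
        # For this variant, 2^c - 1 is an unbiased estimate of the count.
        return 2 ** self.c - 1


class MaxGeoCounter:
    """Keeps the maximum of Geo(1/2) draws, one draw per increment (assumed variant)."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.m = 0  # maximum geometric value observed so far

    def increment(self):
        # Draw a geometric variable on {1, 2, ...} with success probability 1/2.
        g = 1
        while self.rng.random() < 0.5:
            g += 1
        self.m = max(self.m, g)


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only to make the demo repeatable
    morris, maxgeo = MorrisCounter(rng), MaxGeoCounter(rng)
    for _ in range(1000):
        morris.increment()
        maxgeo.increment()
    print("Morris state:", morris.c, "estimate:", morris.estimate())
    print("MaxGeo state:", maxgeo.m)
```

In both cases the only state released is a small integer whose value is random even for a fixed input. This is the inherent randomization that the abstract says suffices for privacy, with no added noise such as the Laplace mechanism would use.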