We consider privacy in the context of streaming algorithms for cardinality estimation. We show that a large class of algorithms satisfies $\epsilon$-differential privacy, provided that (a) the algorithm is combined with a simple down-sampling procedure, and (b) the cardinality of the input stream is $\Omega(k/\epsilon)$. Here, $k$ is a certain parameter of the sketch that is always at most the sketch size in bits, but is typically much smaller. We also show that, even with no modification, algorithms in our class satisfy $(\epsilon, \delta)$-differential privacy, where $\delta$ decreases exponentially with the stream cardinality. Our analysis applies to essentially all popular cardinality estimation algorithms, and it substantially generalizes and tightens privacy bounds from earlier work.
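To make the structure in condition (a) concrete, the following is a minimal Python sketch of a cardinality estimator that is preceded by a down-sampling step. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the class name `DownsampledKMV`, the choice of a k-minimum-values sketch as the underlying estimator, the salted SHA-256 hash, and the `sample_rate` parameter. It does not choose the sampling rate as a function of $\epsilon$, does not check condition (b), and by itself establishes no privacy guarantee.

```python
import hashlib
import heapq


def unit_hash(item: str, salt: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1) with a salted hash."""
    digest = hashlib.sha256((salt + ":" + item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class DownsampledKMV:
    """Hypothetical example: hash-based down-sampling in front of a
    k-minimum-values sketch. Names and parameters are illustrative only."""

    def __init__(self, k: int, sample_rate: float, salt: str = "demo"):
        self.k = k
        self.sample_rate = sample_rate
        self.salt = salt
        self._heap = []        # negated hashes, so the largest kept hash is on top
        self._members = set()  # hashes currently stored, for de-duplication

    def add(self, item: str) -> None:
        # Down-sampling: an item survives only if its sampling hash falls
        # below the rate. The decision is per distinct item, so repeated
        # occurrences of the same item are treated identically.
        if unit_hash(item, self.salt + "/sample") >= self.sample_rate:
            return
        h = unit_hash(item, self.salt + "/sketch")
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # Replace the current largest of the k smallest hashes.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self) -> float:
        # Estimate the sampled cardinality, then rescale by the sampling rate.
        if len(self._heap) < self.k:
            sampled = float(len(self._heap))
        else:
            sampled = (self.k - 1) / (-self._heap[0])
        return sampled / self.sample_rate
```

In this sketch the estimate depends only on the set of distinct items seen, not on their order or multiplicity, and the final division by `sample_rate` compensates for the items dropped by down-sampling.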