We study the problem of counting the number of distinct elements in a dataset subject to the constraint of differential privacy (DP). We consider the challenging setting of person-level DP (a.k.a. user-level DP), where each person may contribute an unbounded number of items and hence the sensitivity of the query is unbounded. Our approach is to compute a bounded-sensitivity version of this query, which reduces to solving a max-flow problem. The sensitivity bound is chosen to balance the noise we must add to privatize the answer against the approximation error between the bounded-sensitivity query and the true number of distinct elements.
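To make the reduction concrete, here is a minimal Python sketch of one way such a bounded-sensitivity count could be computed. It is an illustration, not the paper's implementation. It assumes the bounded query is "the largest number of distinct items that can be covered when each person is credited with at most `delta` items", which is a bipartite max-flow (b-matching) problem solved here with simple augmenting paths. The names `bounded_distinct_count`, `noisy_count`, `delta`, and `epsilon` are hypothetical, and the Laplace scale `delta / epsilon` assumes that this query has sensitivity `delta`.

```python
import random


def bounded_distinct_count(data, delta):
    """Max number of distinct items covered when each person is credited
    with at most `delta` items (assumed form of the bounded query).

    data: dict mapping person -> iterable of items.
    """
    owners = {}  # item -> list of persons who contributed it
    for person, items in data.items():
        for item in set(items):
            owners.setdefault(item, []).append(person)
    assigned = {p: set() for p in data}  # person -> items credited to them

    def try_assign(item, seen):
        # Augmenting-path search: credit `item` to some person, moving
        # already-credited items to other persons when needed.
        for p in owners[item]:
            if p in seen:
                continue
            seen.add(p)
            if len(assigned[p]) < delta:
                assigned[p].add(item)
                return True
            for other in list(assigned[p]):
                if try_assign(other, seen):
                    assigned[p].discard(other)
                    assigned[p].add(item)
                    return True
        return False

    return sum(try_assign(item, set()) for item in owners)


def noisy_count(data, delta, epsilon, rng=random):
    """Add Laplace noise of scale delta / epsilon (assumes sensitivity delta).

    The difference of two i.i.d. exponentials is Laplace-distributed.
    Floating-point sampling like this is for illustration only.
    """
    rate = epsilon / delta
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return bounded_distinct_count(data, delta) + noise


example = {"a": [1, 2, 3], "b": [1], "c": [3, 4]}
count_delta_1 = bounded_distinct_count(example, 1)
count_delta_2 = bounded_distinct_count(example, 2)
```

In this toy dataset the true number of distinct items is 4. A small `delta` undercounts but needs less noise, while a larger `delta` recovers the full count at the cost of more noise, which is the trade-off the abstract describes. The sketch takes `delta` as given and does not attempt to select it privately.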