Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is used extensively in the academic literature, many practical libraries use the more conservative add-remove model. However, analysis under the add-remove model can be cumbersome, and obtaining results with tight constants requires additional work. Here, we study the problem of one-dimensional mean estimation under the add-remove model of differential privacy. We propose a new algorithm and show that it is minimax optimal, that it attains the correct constant in the leading term of the mean squared error, and that this constant matches that of the optimal algorithm in the swap model. Our results show that, for mean estimation, the add-remove and swap models give nearly identical error, even though the add-remove model cannot treat the size of the dataset as public information. In addition, we demonstrate empirically that our proposed algorithm yields a factor-of-two improvement in mean squared error over algorithms often used in practice.
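To make the add-remove setting concrete, the following is a minimal sketch of one familiar style of baseline: release a noisy sum and a noisy count, then divide. It is not the algorithm proposed in this work, and it is only an assumption that it resembles the practical baselines referred to above. The function names, the even split of the privacy budget, the centering at the midpoint, and the floor on the noisy count are all illustrative choices. Because the dataset size is private under add-remove, the count must be estimated with noise as well.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    # Illustrative sampler only; floating-point Laplace noise is not
    # considered safe for production differential privacy.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def baseline_dp_mean(values, lower, upper, epsilon, rng):
    """Noisy-sum / noisy-count mean under add-remove neighbors (hypothetical baseline)."""
    mid = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0

    # Clip to [lower, upper] and center at the midpoint, so adding or removing
    # one record changes the centered sum by at most half_width.
    centered = [min(max(v, lower), upper) - mid for v in values]

    # Assumed budget split: half for the sum, half for the count.
    eps_sum = epsilon / 2.0
    eps_count = epsilon / 2.0

    noisy_sum = sum(centered) + laplace(half_width / eps_sum, rng)
    # Adding or removing one record changes the count by exactly 1.
    noisy_count = len(centered) + laplace(1.0 / eps_count, rng)

    # Post-processing: avoid dividing by a tiny or negative noisy count,
    # and project the estimate back into the known range.
    noisy_count = max(noisy_count, 1.0)
    estimate = mid + noisy_sum / noisy_count
    return min(max(estimate, lower), upper)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [0.2] * 5000 + [0.8] * 5000
    print(baseline_dp_mean(data, 0.0, 1.0, epsilon=1.0, rng=rng))
```

Each of the two releases is a Laplace mechanism calibrated to its add-remove sensitivity, so by basic composition the pair satisfies epsilon-differential privacy, and the division and clipping are post-processing. Spending half of the budget on the count is one intuitive source of inefficiency relative to the swap model, where the dataset size is public.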