We present a data analytics system that releases accurate counts under differential privacy with minimal onboarding effort, and we show instances where it outperforms approaches that require more onboarding effort. The primary difference between our proposal and existing approaches is that it does not rely on bounds on the number of distinct elements each user contributes, i.e., $\ell_0$-sensitivity bounds, which can significantly bias counts. Such contribution bounds have been considered necessary to ensure differential privacy, but we show that they are not, and that dropping them can lead to releasing more results that are also more accurate. Our approach requires minimal hyperparameter tuning, and we demonstrate results on several publicly available datasets. We hope that this approach will help differential privacy scale to many different data analytics applications.