Differential privacy (DP) is a widely used notion for reasoning about privacy when publishing aggregate data. In this paper, we observe that certain DP mechanisms are amenable to an a posteriori privacy analysis, which relies on the fact that some outputs leak less information about the input database than others. To exploit this phenomenon, we introduce output differential privacy (ODP) and a new composition experiment, and use these constructs to obtain significant privacy budget savings and improved privacy-utility tradeoffs under composition. These savings come at no cost to privacy: the privacy guarantee is not weakened. To demonstrate the applicability of our a posteriori privacy analysis techniques, we analyze two well-known mechanisms: the Sparse Vector Technique and the Propose-Test-Release framework. We then show how our techniques can save privacy budget in two more general settings: when a differentially private iterative mechanism terminates before reaching its maximal number of iterations, and when the output of a DP mechanism provides unsatisfactory utility. Examples of the former include iterative optimization algorithms, whereas examples of the latter include training a machine learning model that ends up with a large generalization error. Our techniques can be applied beyond the current paper to refine the analysis of existing DP mechanisms or to guide the design of future mechanisms.