Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or for empirically measuring privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions about the adversary (e.g., knowledge of intermediate model iterates or the training data distribution), are tailored to specific tasks and model architectures, and require retraining the model many times (typically on the order of thousands). These shortcomings make such techniques difficult to deploy at scale in practice, especially in federated settings where model training can take days or weeks. In this work, we present a novel "one-shot" approach that systematically addresses these challenges, allowing efficient auditing or estimation of a model's privacy loss during the same single training run used to fit the model parameters, without requiring any a priori knowledge of the model architecture or task. We show that our method provides provably correct estimates of the privacy loss under the Gaussian mechanism, and we demonstrate its performance on a well-established federated learning (FL) benchmark dataset under several adversarial models.
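For context on the analytical bounds mentioned above, the following minimal sketch computes the exact (ε, δ) trade-off of a single application of the Gaussian mechanism, using the standard closed-form characterization δ(ε) = Φ(μ/2 − ε/μ) − e^ε Φ(−μ/2 − ε/μ) with μ = Δ/σ. It is not the one-shot estimation method of this work; it only illustrates the kind of analytical reference value against which an empirical estimate could be compared. The function names, the default sensitivity of 1, and the bisection tolerance are illustrative assumptions, and the code assumes a moderate noise level (roughly σ/Δ ≥ 0.1) so that `math.exp` does not overflow.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for better accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(eps: float, sigma: float, sensitivity: float = 1.0) -> float:
    """Smallest delta for which the Gaussian mechanism is (eps, delta)-DP.

    Assumes a single release with L2 sensitivity `sensitivity` and noise
    standard deviation `sigma` (both names are illustrative).
    """
    mu = sensitivity / sigma
    return normal_cdf(mu / 2.0 - eps / mu) - math.exp(eps) * normal_cdf(-mu / 2.0 - eps / mu)


def gaussian_epsilon(sigma: float, delta: float, sensitivity: float = 1.0,
                     tol: float = 1e-10) -> float:
    """Smallest eps (up to `tol`) such that gaussian_delta(eps) <= delta, found by bisection."""
    if gaussian_delta(0.0, sigma, sensitivity) <= delta:
        return 0.0
    lo, hi = 0.0, 1.0
    # gaussian_delta is decreasing in eps, so double hi until the target is met.
    while gaussian_delta(hi, sigma, sensitivity) > delta:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, sigma, sensitivity) > delta:
            lo = mid
        else:
            hi = mid
    return hi  # hi always satisfies the delta constraint


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        print(f"sigma={sigma}: eps={gaussian_epsilon(sigma, delta=1e-5):.4f}")
```

An empirical estimate that falls well below the value returned by `gaussian_epsilon` for the same noise level would suggest that the analytical bound is loose under the adversary model being considered.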