Auditing mechanisms for differential privacy use probabilistic means to empirically estimate the privacy level of an algorithm. For private machine learning, existing auditing mechanisms are tight: the empirical privacy estimate (nearly) matches the algorithm's provable privacy guarantee. But these auditing techniques suffer from two limitations. First, they only give tight estimates under implausible worst-case assumptions (e.g., a fully adversarial dataset). Second, they require thousands or millions of training runs to produce non-trivial statistical estimates of the privacy leakage. This work addresses both issues. We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets, provided that the adversary can see all model updates during training. Prior auditing works rely on the same assumption, which is permitted under the standard differential privacy threat model and also applies, e.g., in federated learning settings. Moreover, by adapting recent advances in tight composition theorems for differential privacy, our auditing scheme requires only two training runs (instead of thousands) to produce tight privacy estimates. We demonstrate the utility of our improved auditing scheme by surfacing implementation bugs in private machine learning code that eluded prior auditing techniques.
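To make "empirically estimate the privacy level" concrete, the following is a minimal sketch of the classical conversion that earlier audits typically rely on, not the scheme proposed in this work. It assumes a distinguishing attack was run on `n_trials` models trained without a target example and `n_trials` models trained with it, and it turns the observed error counts into a statistical lower bound on ε using Clopper-Pearson intervals and the (ε, δ)-DP constraint FPR + e^ε · FNR ≥ 1 − δ (and its symmetric counterpart). All function and parameter names are hypothetical and chosen only for illustration.

```python
import math


def binom_cdf(k, n, p):
    """P[X <= k] for X ~ Binomial(n, p), summed in log space for stability."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def clopper_pearson_upper(k, n, alpha):
    """One-sided Clopper-Pearson upper bound on a rate, given k events in n trials."""
    if k >= n:
        return 1.0
    lo, hi = k / n, 1.0
    # The CDF is decreasing in p; bisect for the p where it drops to alpha.
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi  # conservative end of the final bracket


def empirical_epsilon_lower_bound(false_pos, false_neg, n_trials, delta,
                                  confidence=0.95):
    """Lower bound on epsilon implied by the attack's observed error counts.

    Assumption: false positives and false negatives are each counted over
    n_trials independent training runs.
    """
    alpha = (1 - confidence) / 2  # split the failure probability over both rates
    fpr_hi = clopper_pearson_upper(false_pos, n_trials, alpha)
    fnr_hi = clopper_pearson_upper(false_neg, n_trials, alpha)
    candidates = [0.0]
    for a, b in ((fpr_hi, fnr_hi), (fnr_hi, fpr_hi)):
        numerator = 1 - delta - a
        if numerator > 0 and b > 0:
            candidates.append(math.log(numerator / b))
    return max(candidates)
```

Under these assumptions, even an attack that makes no errors at all over 1000 runs of each kind certifies an ε of only about 5.6 at 95% confidence, because the confidence intervals cannot rule out error rates of a few tenths of a percent. This is one way to see why audits of this style need so many training runs.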