Differential privacy (DP) quantifies privacy loss by measuring how much the output distribution changes when an individual is included in the target dataset. DP, which is generally used as a constraint, has become prominent in safeguarding machine learning datasets at industry giants such as Apple and Google. A common way to guarantee DP is to add appropriately calibrated noise to query outputs, thereby establishing a statistical defense against privacy attacks such as membership inference and linkage attacks. However, especially for small datasets, existing DP mechanisms occasionally add an excessive amount of noise to the query output, thereby degrading data utility. This is because traditional DP computes privacy loss based on the worst-case scenario, i.e., statistical outliers. In this work, to tackle this challenge, we use per-instance DP (pDP) as a constraint, measuring the privacy loss for each data instance and optimizing noise tailored to individual instances. In a nutshell, we propose a per-instance noise variance optimization (NVO) game, framed as a common-interest sequential game, and show that its Nash equilibrium (NE) points inherently guarantee pDP for all data instances. In extensive experiments, our proposed pDP algorithm improved average performance by up to 99.53% over the conventional DP algorithm in terms of KL divergence.
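To illustrate the gap between worst-case and per-instance privacy loss described above, the following is a minimal sketch and not the NVO game proposed in this work. It assumes a sum query over values in [0, upper_bound], add/remove neighboring datasets, and a Laplace mechanism with a fixed noise scale; the function names and the toy data are hypothetical. Under these assumptions, the density ratio of the Laplace mechanism is bounded by exp(|x_i| / scale) when record x_i is added or removed, so records far from the bound incur much less privacy loss than the worst-case figure suggests.

```python
import numpy as np


def laplace_sum(data, scale, rng):
    """Release the sum of `data` with Laplace noise of the given scale."""
    return float(np.sum(data) + rng.laplace(0.0, scale))


def worst_case_epsilon(upper_bound, scale):
    """Worst-case privacy loss: a record at the upper bound shifts the
    sum by `upper_bound`, so epsilon = upper_bound / scale."""
    return upper_bound / scale


def per_instance_epsilon(data, scale):
    """Per-instance privacy loss under the same mechanism: adding or
    removing record x_i shifts the sum by |x_i|, so eps_i = |x_i| / scale."""
    return np.abs(data) / scale


# Toy data (assumed for illustration): three typical records, one near the bound.
rng = np.random.default_rng(0)
data = np.array([0.1, 0.2, 0.15, 0.9])
upper_bound = 1.0
scale = 2.0

noisy_sum = laplace_sum(data, scale, rng)
eps_worst = worst_case_epsilon(upper_bound, scale)
eps_each = per_instance_epsilon(data, scale)

print("noisy sum:", noisy_sum)
print("worst-case epsilon:", eps_worst)
print("per-instance epsilon:", eps_each)
```

In this toy setting, the noise scale is dictated by a hypothetical record at the bound, while most actual records incur a far smaller privacy loss. This is the slack that motivates tailoring noise to individual instances.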