We analyse the relationship between privacy vulnerability and dataset properties, such as the number of examples per class and the number of classes, when applying two state-of-the-art membership inference attacks (MIAs) to fine-tuned neural networks. We derive per-example MIA vulnerability in terms of score distributions and statistics computed from shadow models. We introduce a simplified model of membership inference and prove that, in this model, the logarithm of the difference between the true and false positive rates depends linearly on the logarithm of the number of examples per class. We complement the theory with empirical analysis, systematically measuring the practical privacy vulnerability of fine-tuning large image classification models, and recover the derived power-law dependence between the number of examples per class and MIA vulnerability, measured by the true positive rate of the attack at a low false positive rate. Finally, we fit a parametric model of the derived form to predict the true positive rate from dataset properties, and observe a good fit to MIA vulnerability in unseen fine-tuning scenarios.
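For concreteness, the power law stated above can be written in the following form; the symbols $S$, $\alpha$, and $\beta$ are illustrative notation introduced here, not taken from the text:
$$\log(\mathrm{TPR} - \mathrm{FPR}) = \alpha \log S + \beta \quad\Longleftrightarrow\quad \mathrm{TPR} - \mathrm{FPR} = e^{\beta}\, S^{\alpha},$$
where $S$ is the number of examples per class, $\mathrm{TPR}$ and $\mathrm{FPR}$ are the attack's true and false positive rates, and the sign of the fitted exponent $\alpha$ determines whether vulnerability grows or shrinks as $S$ increases.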