We analyse the relationship between privacy vulnerability and dataset properties, such as the number of examples per class and the number of classes, when applying two state-of-the-art membership inference attacks (MIAs) to fine-tuned neural networks. We derive per-example MIA vulnerability in terms of score distributions and statistics computed from shadow models. We introduce a simplified model of membership inference and prove that, in this model, the logarithm of the difference between the true and false positive rates depends linearly on the logarithm of the number of examples per class. We complement the theory with empirical analysis, systematically measuring the practical privacy vulnerability of fine-tuning large image classification models, and recover the derived power-law dependence between the number of examples per class and MIA vulnerability, measured by the true positive rate of the attack at a low false positive rate. Finally, we fit a parametric model of the derived form to predict the true positive rate from dataset properties, and observe a good fit to MIA vulnerability in unseen fine-tuning scenarios.
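For concreteness, the power law stated above can be written in the following form; the symbols $S$, $\alpha$, and $\beta$ are illustrative notation introduced here, not taken from the text:
$$\log(\mathrm{TPR} - \mathrm{FPR}) = \alpha \log S + \beta \quad\Longleftrightarrow\quad \mathrm{TPR} - \mathrm{FPR} = e^{\beta}\, S^{\alpha},$$
where $S$ is the number of examples per class, $\mathrm{TPR}$ and $\mathrm{FPR}$ are the attack's true and false positive rates, and the sign of the fitted exponent $\alpha$ determines whether vulnerability grows or shrinks as $S$ increases.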