TMI! Finetuned Models Leak Private Information from their Pretraining Data

Transfer learning has become an increasingly popular machine learning technique, in which a model pretrained for one task is used to help build a finetuned model for a related task. This paradigm has been especially popular for $\textit{privacy}$ in machine learning, where the pretrained model is considered public and only the finetuning data is considered sensitive. However, there are reasons to believe that the pretraining data is still sensitive, making it essential to understand how much information the finetuned model leaks about it. In this work, we propose a new membership-inference threat model in which the adversary has access only to the finetuned model and seeks to infer whether given samples were part of the pretraining data. To realize this threat model, we implement a novel metaclassifier-based attack, $\textbf{TMI}$, that leverages the influence of memorized pretraining samples on predictions in the downstream task. We evaluate $\textbf{TMI}$ on both vision and natural language tasks across multiple transfer learning settings, including finetuning with differential privacy. Our evaluation shows that $\textbf{TMI}$ can successfully infer membership of pretraining examples using only query access to the finetuned model. An open-source implementation of $\textbf{TMI}$ can be found on GitHub: https://github.com/johnmath/tmi-pets24.
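
To make the idea of a metaclassifier-based membership-inference attack more concrete, the following is a minimal sketch and not the authors' implementation (see the GitHub repository for that). It assumes a shadow-model setup, which the abstract does not spell out: the adversary has prediction vectors from finetuned shadow models whose pretraining sets either contained (IN) or did not contain (OUT) a challenge point. The function names `train_metaclassifier` and `membership_score` are hypothetical, the metaclassifier is a plain logistic regression, and synthetic Gaussian vectors stand in for real model outputs.

```python
import numpy as np


def train_metaclassifier(features, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression metaclassifier with plain gradient descent.

    features: (n_shadow, n_classes) prediction vectors from finetuned shadow models
    labels:   (n_shadow,) 1.0 if the challenge point was in that shadow model's
              pretraining set (IN), else 0.0 (OUT)
    Returns a weight vector of length n_classes + 1 (last entry is the bias).
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))        # predicted probability of IN
        w -= lr * (X.T @ (p - labels)) / len(labels)  # gradient of the log loss
    return w


def membership_score(w, query_vectors):
    """Average metaclassifier output over the target model's prediction vectors."""
    X = np.hstack([query_vectors, np.ones((query_vectors.shape[0], 1))])
    return float(np.mean(1.0 / (1.0 + np.exp(-(X @ w)))))


# Synthetic stand-in data (assumption): IN vectors are shifted along the first
# coordinate, mimicking the influence of a memorized pretraining sample.
rng = np.random.default_rng(0)
shift = np.array([2.0, 0.0, 0.0])
in_vecs = rng.normal(loc=shift, scale=0.5, size=(200, 3))
out_vecs = rng.normal(loc=0.0, scale=0.5, size=(200, 3))

shadow_features = np.vstack([in_vecs, out_vecs])
shadow_labels = np.concatenate([np.ones(200), np.zeros(200)])
w = train_metaclassifier(shadow_features, shadow_labels)

# Query the (simulated) target finetuned model in both scenarios.
in_score = membership_score(w, rng.normal(loc=shift, scale=0.5, size=(8, 3)))
out_score = membership_score(w, rng.normal(loc=0.0, scale=0.5, size=(8, 3)))
print(f"score when challenge point was IN:  {in_score:.3f}")
print(f"score when challenge point was OUT: {out_score:.3f}")
```

In a real attack, the feature vectors would come from querying shadow models and the target finetuned model on the challenge point (possibly with augmentations), and the score would be thresholded to make the membership decision.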
