Large-scale pre-trained models are increasingly adapted to downstream tasks through a new paradigm called prompt learning. In contrast to fine-tuning, prompt learning does not update the pre-trained model's parameters. Instead, it learns only an input perturbation, called a prompt, that is added to the downstream task data before prediction. Given the rapid development of prompt learning, a well-generalized prompt becomes a valuable asset, since significant effort and proprietary data go into creating it. This naturally raises the question of whether a prompt may leak proprietary information about its training data. In this paper, we perform the first comprehensive privacy assessment of prompts learned by visual prompt learning through the lens of property inference and membership inference attacks. Our empirical evaluation shows that the prompts are vulnerable to both attacks. We also demonstrate that an adversary can mount a successful property inference attack at limited cost. Moreover, we show that membership inference attacks against prompts can succeed under relaxed adversarial assumptions. We further conduct an initial investigation of defenses and observe that our method can mitigate membership inference attacks with a decent utility-defense trade-off but fails to defend against property inference attacks. We hope our results shed light on the privacy risks of the popular prompt learning paradigm. To facilitate research in this direction, we will share our code and models with the community.
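To make the notion of a prompt as a learned input perturbation concrete, the following is a minimal sketch and not the setup evaluated in this paper. It assumes a toy frozen linear classifier (`W`, `b`) as a stand-in for the pre-trained model, synthetic data, and a hypothetical helper `train_prompt`. Only the additive prompt `delta` is updated; the model parameters are never touched.

```python
import numpy as np


def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    n = labels.shape[0]
    return float(-np.log(probs[np.arange(n), labels] + 1e-12).mean())


def train_prompt(W, b, x, y, steps=200, lr=0.05):
    """Learn an additive prompt `delta`; the frozen model (W, b) is only read."""
    n = x.shape[0]
    delta = np.zeros(x.shape[1])
    losses = []
    for _ in range(steps):
        # The prompt is added to every input before the frozen model sees it.
        probs = softmax((x + delta) @ W + b)
        losses.append(cross_entropy(probs, y))
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = probs.copy()
        grad_logits[np.arange(n), y] -= 1.0
        grad_logits /= n
        # Chain rule back to the shared prompt: sum contributions over samples.
        grad_delta = (grad_logits @ W.T).sum(axis=0)
        delta -= lr * grad_delta
    return delta, losses


# Toy stand-ins (assumptions): a random "pre-trained" model and synthetic data.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 3))
b = np.zeros(3)
W_before = W.copy()
x = rng.normal(size=(64, 16))
y = rng.integers(0, 3, size=64)

delta, losses = train_prompt(W, b, x, y)
print("loss before:", losses[0], "loss after:", losses[-1])
print("model unchanged:", np.array_equal(W, W_before))
```

Because `delta` is fitted to the downstream training data, it is the only artifact that can carry information about that data, which is why the privacy analysis targets the prompt itself.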