Exploring Privacy and Fairness Risks in Sharing Diffusion Models: An Adversarial Perspective

Diffusion models have recently gained significant attention in both academia and industry because of their strong generative performance in terms of both sampling quality and distribution coverage. Accordingly, sharing pre-trained diffusion models across organizations has been proposed as a way to improve data utilization while enhancing privacy protection, since private data is not shared directly. However, the potential risks of this approach have not been comprehensively examined. In this paper, we take an adversarial perspective to investigate the privacy and fairness risks associated with sharing diffusion models. Specifically, we study the setting in which one party (the sharer) trains a diffusion model on private data and gives another party (the receiver) black-box access to the pre-trained model for downstream tasks. We demonstrate that the sharer can execute fairness poisoning attacks that undermine the receiver's downstream models by manipulating the training data distribution of the diffusion model. Meanwhile, the receiver can perform property inference attacks to reveal the distribution of sensitive features in the sharer's dataset. Our experiments on real-world datasets demonstrate strong attack performance on different types of diffusion models, which highlights the importance of robust data auditing and privacy protection protocols in such applications.
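To make the receiver-side risk concrete, the following is a minimal sketch of the general idea behind a property inference attack, not the method used in the paper. It assumes that the shared model reproduces the training distribution of a binary sensitive attribute, and that the receiver has an attribute classifier with a known accuracy. The functions `sample_from_shared_model` and `noisy_attribute_classifier`, and the values 0.3 and 0.9, are hypothetical stand-ins chosen for illustration only.

```python
import random


def sample_from_shared_model(n, hidden_ratio, rng):
    """Hypothetical stand-in for black-box sampling from the shared model.

    Each sample is reduced to one binary sensitive attribute (1 or 0),
    drawn with the sharer's hidden training ratio. A real generator would
    return images or records whose attribute must be predicted.
    """
    return [1 if rng.random() < hidden_ratio else 0 for _ in range(n)]


def noisy_attribute_classifier(true_attr, accuracy, rng):
    """Hypothetical classifier that is correct with probability `accuracy`."""
    return true_attr if rng.random() < accuracy else 1 - true_attr


def estimate_ratio(predictions, accuracy):
    """Estimate the share of attribute 1, correcting for classifier error.

    With true ratio p and accuracy a, the observed positive rate is
    q = a*p + (1 - a)*(1 - p), hence p = (q - (1 - a)) / (2a - 1).
    """
    q = sum(predictions) / len(predictions)
    p = (q - (1 - accuracy)) / (2 * accuracy - 1)
    return min(1.0, max(0.0, p))  # clamp to a valid proportion


rng = random.Random(0)
hidden_ratio = 0.3   # assumed value, unknown to the receiver in practice
accuracy = 0.9       # assumed classifier accuracy

samples = sample_from_shared_model(20000, hidden_ratio, rng)
preds = [noisy_attribute_classifier(s, accuracy, rng) for s in samples]
estimate = estimate_ratio(preds, accuracy)
print(f"estimated ratio: {estimate:.3f}")
```

The sketch only needs sampling access, which matches the black-box setting described above. In practice the estimate is only as good as the assumption that generated samples mirror the training distribution.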

