Exploring Privacy and Fairness Risks in Sharing Diffusion Models: An Adversarial Perspective

Diffusion models have recently gained significant attention in both academia and industry due to their impressive generative performance in terms of both sampling quality and distribution coverage. Accordingly, proposals are made for sharing pre-trained diffusion models across different organizations, as a way of improving data utilization while enhancing privacy protection by avoiding sharing private data directly. However, the potential risks associated with such an approach have not been comprehensively examined. In this paper, we take an adversarial perspective to investigate the potential privacy and fairness risks associated with the sharing of diffusion models. Specifically, we investigate the circumstances in which one party (the sharer) trains a diffusion model using private data and provides another party (the receiver) black-box access to the pre-trained model for downstream tasks. We demonstrate that the sharer can execute fairness poisoning attacks to undermine the receiver's downstream models by manipulating the training data distribution of the diffusion model. Meanwhile, the receiver can perform property inference attacks to reveal the distribution of sensitive features in the sharer's dataset. Our experiments conducted on real-world datasets demonstrate remarkable attack performance on different types of diffusion models, which highlights the critical importance of robust data auditing and privacy protection protocols in pertinent applications.
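The abstract does not say how the receiver's property inference attack works, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that samples drawn from the shared model roughly mirror the sensitive-feature distribution of the sharer's training data, and that the receiver can label the sensitive feature of each sample. The names `make_black_box_sampler`, `estimate_property_ratio`, and `hidden_ratio` are hypothetical, and the "model" is a toy stand-in that emits only the sensitive attribute.

```python
import random


def make_black_box_sampler(hidden_ratio, seed=0):
    """Toy stand-in (assumption) for black-box access to a shared generative model.

    Each call returns only the sensitive attribute (1 or 0) of one generated
    sample; hidden_ratio plays the role of the sharer's private proportion.
    """
    rng = random.Random(seed)

    def sample():
        return 1 if rng.random() < hidden_ratio else 0

    return sample


def estimate_property_ratio(sample_fn, classify_fn, n_samples):
    """Estimate the fraction of samples carrying the sensitive attribute.

    sample_fn: black-box sampler queried by the receiver.
    classify_fn: receiver-side attribute classifier (assumed available).
    """
    positives = sum(classify_fn(sample_fn()) for _ in range(n_samples))
    return positives / n_samples


# The receiver never sees hidden_ratio; it only queries the sampler.
sampler = make_black_box_sampler(hidden_ratio=0.3, seed=42)
estimate = estimate_property_ratio(sampler, classify_fn=lambda x: x, n_samples=5000)
print(f"estimated sensitive-feature ratio: {estimate:.3f}")
```

In this toy setting the estimate converges on the hidden proportion as the number of queries grows, which illustrates why black-box sampling access alone can leak distribution-level information. The same dependence on the training distribution is what a sharer could exploit in the opposite direction by skewing the data before training.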

