Diffusion models have recently gained significant attention in both academia and industry for their impressive generative performance, in terms of both sampling quality and distribution coverage. Accordingly, sharing pre-trained diffusion models across organizations has been proposed as a way to improve data utilization while enhancing privacy protection, since private data need not be shared directly. However, the potential risks of this approach have not been comprehensively examined. In this paper, we take an adversarial perspective to investigate the privacy and fairness risks associated with sharing diffusion models. Specifically, we consider the setting in which one party (the sharer) trains a diffusion model on private data and gives another party (the receiver) black-box access to the pre-trained model for downstream tasks. We demonstrate that the sharer can execute fairness poisoning attacks that undermine the receiver's downstream models by manipulating the training data distribution of the diffusion model. Meanwhile, the receiver can perform property inference attacks to reveal the distribution of sensitive features in the sharer's dataset. Our experiments on real-world datasets demonstrate remarkable attack performance on different types of diffusion models, highlighting the critical importance of robust data auditing and privacy protection protocols in pertinent applications.
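As a rough illustration of the property inference idea described in the abstract, the sketch below shows how a receiver with only black-box sampling access might estimate the proportion of a sensitive attribute from generated samples. It is not the paper's attack: `sample_from_shared_model` is a hypothetical stand-in for querying the shared model and labeling each sample's sensitive attribute, and the value `hidden_ratio = 0.3` is an assumed example, not a reported result.

```python
import random


def sample_from_shared_model(n, hidden_ratio, rng):
    """Hypothetical stand-in for black-box sampling.

    Returns n binary labels (1 or 0) for a sensitive attribute, drawn so
    that label 1 appears with probability hidden_ratio. In a real setting
    the receiver would query the shared model and label each generated
    sample, for example with its own attribute classifier (an assumption).
    """
    return [1 if rng.random() < hidden_ratio else 0 for _ in range(n)]


def estimate_property_ratio(labels):
    """Estimate the fraction of samples whose sensitive attribute is 1."""
    if not labels:
        raise ValueError("need at least one sample")
    return sum(labels) / len(labels)


rng = random.Random(0)   # fixed seed so the sketch is reproducible
hidden_ratio = 0.3       # assumed value; unknown to the receiver in practice
labels = sample_from_shared_model(10_000, hidden_ratio, rng)
estimate = estimate_property_ratio(labels)
print(f"estimated ratio of the sensitive attribute: {estimate:.3f}")
```

The sketch assumes that generated samples roughly follow the training distribution, so the empirical frequency among samples approximates the proportion in the sharer's dataset. Any labeling error in a real attribute classifier would bias this simple estimate.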