Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation

Diffusion models have demonstrated remarkable performance in image generation tasks, paving the way for powerful AIGC applications. However, these widely used generative models also raise security and privacy concerns, such as copyright infringement and sensitive data leakage. To tackle these issues, we propose a method, Unlearnable Diffusion Perturbation, to safeguard images from unauthorized exploitation. We design an algorithm that generates sample-wise perturbation noise for each image to be protected. This imperceptible protective noise makes the data almost unlearnable for diffusion models, i.e., diffusion models trained or fine-tuned on the protected data cannot generate high-quality and diverse images related to the protected training data. Theoretically, we frame this as a max-min optimization problem and introduce EUDP, a noise scheduler-based method to enhance the effectiveness of the protective noise. We evaluate our methods on both Denoising Diffusion Probabilistic Models and Latent Diffusion Models, demonstrating that training diffusion models on the protected data leads to a significant reduction in the quality of the generated images. In particular, the experimental results on Stable Diffusion demonstrate that our method effectively safeguards images from being used to train diffusion models in various tasks, such as learning specific objects and styles. This result matters in real-world scenarios, as it contributes to the protection of privacy and copyright against AI-generated content.
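The idea of a bounded, sample-wise perturbation can be illustrated with a toy sketch. The code below is not the authors' algorithm and does not implement EUDP; it only shows one projected signed-gradient update of a per-image perturbation against a standard DDPM-style noise-prediction loss. Everything here is an assumption made for illustration: the linear noise predictor W stands in for a real denoising network, the function names perturb_step and denoise_loss are hypothetical, and the direction argument is left as a free parameter because the abstract does not spell out how the max-min objective assigns the roles of perturbation and model.

```python
import numpy as np


def noised_input(x, delta, abar_t, eps_noise):
    # DDPM-style forward noising of the perturbed image (x + delta).
    return np.sqrt(abar_t) * (x + delta) + np.sqrt(1.0 - abar_t) * eps_noise


def denoise_loss(x, delta, W, abar_t, eps_noise):
    # Squared error between the true noise and the toy prediction W @ x_t.
    # W is a hypothetical linear stand-in for a denoising network.
    residual = eps_noise - W @ noised_input(x, delta, abar_t, eps_noise)
    return float(np.sum(residual ** 2))


def perturb_step(x, delta, W, abar_t, eps_noise, step, budget, direction=-1.0):
    # One projected signed-gradient update of the per-image perturbation.
    # direction=-1.0 lowers the loss, direction=+1.0 raises it (assumption:
    # the choice depends on the objective, which is not reproduced here).
    residual = eps_noise - W @ noised_input(x, delta, abar_t, eps_noise)
    # Analytic gradient of the squared error with respect to delta.
    grad = -2.0 * np.sqrt(abar_t) * (W.T @ residual)
    delta = delta + direction * step * np.sign(grad)
    # Project back onto the L-infinity ball so the noise stays small.
    return np.clip(delta, -budget, budget)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 16
    image = rng.uniform(0.0, 1.0, size=dim)      # flattened toy "image"
    W = 0.1 * rng.standard_normal((dim, dim))    # toy noise predictor
    noise = rng.standard_normal(dim)
    delta = np.zeros(dim)
    for _ in range(50):
        delta = perturb_step(image, delta, W, 0.5, noise, step=1e-3, budget=8 / 255)
    print("max |delta|:", np.max(np.abs(delta)))
    print("loss:", denoise_loss(image, delta, W, 0.5, noise))
```

In a realistic setting the perturbation update would alternate with updates to the model parameters and would sample timesteps and noise at each iteration; those parts are omitted here because the abstract does not describe them.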

