Text-to-image diffusion models have emerged as a revolutionary technique for producing creative content in image synthesis. Building on the impressive generative abilities of these models, instruction-guided diffusion models can edit an input image given only a simple text instruction. While they let users obtain the edited images they want with ease, they have also raised concerns about unauthorized image manipulation. Prior research has examined the unauthorized use of personalized diffusion models; however, the same problem for instruction-guided diffusion models remains largely unexplored. In this paper, we are the first to propose a protection method, EditShield, against unauthorized modifications by such models. Specifically, EditShield adds imperceptible perturbations that shift the latent representation used in the diffusion process, forcing models to generate unrealistic images with mismatched subjects. Our extensive experiments demonstrate EditShield's effectiveness on both synthetic and real-world datasets. EditShield also remains robust across various editing types and synonymous instruction phrases.
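The core idea, a small bounded perturbation chosen to move an image's latent representation as far as possible, can be illustrated with a toy sketch. The abstract does not specify EditShield's actual objective, encoder, or optimizer, so everything here is an assumption made for illustration: a linear map `W` stands in for the image encoder, the perturbation is bounded in the L-infinity norm by a hypothetical budget `eps`, and the search uses sign-gradient ascent (a PGD-style procedure), which may differ from the authors' method. The function names are likewise hypothetical.

```python
import random

def encode(W, x):
    """Toy linear 'encoder' z = W x; an assumed stand-in for a real image encoder."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def latent_shift(W, x, delta):
    """Squared L2 distance between the latents of the perturbed and clean inputs."""
    z_clean = encode(W, x)
    z_pert = encode(W, [xi + di for xi, di in zip(x, delta)])
    return sum((a - b) ** 2 for a, b in zip(z_pert, z_clean))

def grad_latent_shift(W, delta):
    """Gradient of ||W(x + delta) - W x||^2 with respect to delta: 2 W^T W delta."""
    Wd = encode(W, delta)
    return [2.0 * sum(W[i][j] * Wd[i] for i in range(len(W)))
            for j in range(len(delta))]

def sign(v):
    return (v > 0) - (v < 0)

def protect(W, x, eps=0.05, step=0.01, iters=20, seed=0):
    """Sign-gradient ascent on the latent shift under an L-infinity budget eps.

    Assumes pixel values in [0, 1]. Returns the perturbation delta.
    """
    rng = random.Random(seed)
    # Random start: the gradient is exactly zero at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in x]
    for _ in range(iters):
        g = grad_latent_shift(W, delta)
        stepped = [d + step * sign(gi) for d, gi in zip(delta, g)]
        # Project back so that |delta| <= eps and x + delta stays in [0, 1].
        delta = [max(max(-eps, -xi), min(min(eps, 1.0 - xi), d))
                 for d, xi in zip(stepped, x)]
    return delta

# Example: an 8-"pixel" image and a 4-dimensional latent space.
rng = random.Random(1)
W = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
x = [rng.uniform(0.2, 0.8) for _ in range(8)]
delta = protect(W, x)
protected = [xi + di for xi, di in zip(x, delta)]
print("max |delta|:", max(abs(d) for d in delta))
print("latent shift:", latent_shift(W, x, delta))
```

In a realistic setting the linear map would be replaced by the encoder of the targeted diffusion model and the gradient would come from automatic differentiation; the budget `eps` controls the trade-off between imperceptibility and how far the latent moves.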