Text-to-image generative models based on latent diffusion models (LDMs) have demonstrated an outstanding ability to generate high-quality, high-resolution images from language prompts. Building on these powerful models, various fine-tuning methods have been proposed to personalize text-to-image diffusion models, for example for artistic style adaptation and human face transfer. However, the unauthorized use of data for model personalization has become a prevalent copyright concern. For example, a malicious user may use fine-tuning to generate images that mimic the style of a painter without the painter's permission. In light of this concern, we propose FT-Shield, a watermarking approach specifically designed for the fine-tuning of text-to-image diffusion models to aid in detecting instances of infringement. We develop a novel watermark generation algorithm that ensures the watermark on the training images is quickly and accurately transferred to the images generated by text-to-image diffusion models. A binary watermark detector then detects the watermark on an image if that image was generated by a model fine-tuned on the protected, watermarked images. Comprehensive experiments validate the effectiveness of FT-Shield.