Recent advances in text-conditioned image generative models have made it possible to produce highly realistic images. Unfortunately, this has also led to more privacy violations and a wider spread of false information, creating a need for traceability, privacy protection, and other security measures. However, existing text-to-image paradigms lack the technical capability to link traceable messages with image generation. In this study, we introduce a novel task: joint text-to-image and watermark generation (T2IW). The T2IW scheme keeps damage to image quality minimal when generating a compound image by forcing the semantic features and the watermark signal to be compatible at the pixel level. Additionally, drawing on principles from Shannon information theory and non-cooperative game theory, we separate the revealed image and the revealed watermark from the compound image. Furthermore, we strengthen watermark robustness by subjecting the compound image to various post-processing attacks, after which the revealed watermark shows only minimal pixel distortion. Extensive experiments demonstrate strong results in image quality, watermark invisibility, and watermark robustness, supported by our proposed set of evaluation metrics.