Invisible image watermarking is crucial for ensuring content provenance and accountability in generative AI. While Gen-AI providers are increasingly integrating invisible watermarking systems, the robustness of these schemes against forgery attacks remains poorly characterized. This gap is critical: forging a provider's traceable watermark onto illicit content leads to false attribution, potentially harming the reputation and legal standing of Gen-AI service providers who are not responsible for that content. In this work, we propose WMCopier, an effective watermark forgery attack that requires neither prior knowledge of nor access to the target watermarking algorithm. Our approach first models the target watermark distribution using an unconditional diffusion model, and then seamlessly embeds the target watermark into a non-watermarked image via a shallow inversion process. We also incorporate an iterative optimization procedure that refines the reconstructed image to better balance fidelity against forgery efficiency. Experimental results demonstrate that WMCopier effectively deceives both open-source and closed-source watermark systems (e.g., Amazon's system), achieving a significantly higher success rate than existing methods. Additionally, we evaluate the robustness of forged samples and discuss potential defenses against our attack.
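To make the "shallow inversion" step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes deterministic DDIM updates and a linear noise schedule, neither of which the abstract specifies. The names `eps_model`, `depth`, and `stride` are hypothetical: `eps_model` stands in for the noise predictor of an unconditional diffusion model fitted to the target's watermarked images, and `depth` is how far along the noise schedule the clean image is inverted before being denoised again. The iterative refinement step mentioned above is omitted because the abstract gives no details about it.

```python
import numpy as np


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal level for a linear beta schedule (an assumption).

    Index 0 is the clean image (alpha_bar = 1); index t is noise level t.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])


def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM move from signal level a_from to a_to."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def shallow_inversion_forgery(image, eps_model, alpha_bars, depth=100, stride=10):
    """Invert a clean image to a shallow noise level, then denoise it.

    Because eps_model is assumed to be fitted to watermarked images, the
    denoising pass is what would pull the result toward the watermarked
    distribution. A small depth keeps the image content largely intact.
    """
    timesteps = list(range(0, depth + 1, stride))  # 0, stride, ..., depth
    x = np.asarray(image, dtype=np.float64)

    # Inversion: walk from the clean image up to noise level `depth`.
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        eps = eps_model(x, t_cur)
        x = ddim_step(x, eps, alpha_bars[t_cur], alpha_bars[t_next])

    # Denoising: walk back down to noise level 0 with the same model.
    for t_cur, t_prev in zip(reversed(timesteps[1:]), reversed(timesteps[:-1])):
        eps = eps_model(x, t_cur)
        x = ddim_step(x, eps, alpha_bars[t_cur], alpha_bars[t_prev])

    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.uniform(-1.0, 1.0, size=(3, 32, 32))

    # Toy placeholder predictor; a real attack would use a trained network.
    def toy_eps_model(x, t):
        return 0.1 * np.tanh(x)

    forged = shallow_inversion_forgery(clean, toy_eps_model, make_alpha_bars())
    print("mean absolute change:", float(np.abs(forged - clean).mean()))
```

The `depth` parameter carries the trade-off in this sketch: a deeper inversion gives the model more room to reshape the image and a shallower one preserves more of the original content. Which values work in practice is not something the abstract states.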