Vanishing Watermarks: Diffusion-Based Image Editing Undermines Robust Invisible Watermarking

Robust invisible watermarking schemes aim to embed hidden information into images such that the watermark survives common manipulations. However, powerful diffusion-based image generation and editing techniques now pose a new threat to these watermarks. In this paper, we present a comprehensive theoretical and empirical analysis demonstrating that diffusion models can effectively erase robust watermarks even when those watermarks were designed to withstand conventional distortions. We show that a diffusion-driven image regeneration process, which leverages generative models to recreate an image, can remove embedded watermarks while preserving the image's perceptual content. Furthermore, we introduce a guided diffusion-based attack that explicitly targets the embedded watermark signal during generation, significantly degrading watermark detectability. Theoretically, we prove that as an image undergoes sufficient diffusion transformations, the mutual information between the watermarked image and the hidden payload approaches zero, leading to inevitable decoding failure. Experimentally, we evaluate multiple state-of-the-art watermarking methods (including deep learning-based schemes like StegaStamp, TrustMark, and VINE) and demonstrate that diffusion edits yield near-zero watermark recovery rates after attack, while maintaining high visual fidelity of the regenerated images. Our findings reveal a fundamental vulnerability in current robust watermarking techniques against generative model-based edits, underscoring the need for new strategies to ensure watermark resilience in the era of powerful diffusion models.
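The information-loss claim can be illustrated with a toy experiment. The sketch below is not the paper's attack and does not use StegaStamp, TrustMark, or VINE. It assumes a simple additive spread-spectrum watermark and the standard forward-noising step x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, and it measures how often a correlation decoder still recovers the payload bits. No trained denoiser is involved. If the regeneration step sees only the noised image, the data processing inequality implies the regenerated image cannot carry more information about the payload than the noised image does. All function names, parameters, and the random stand-in cover image are hypothetical choices made for illustration.

```python
import numpy as np


def embed(image, bits, patterns, strength):
    """Toy additive spread-spectrum embedding: one +/-1 carrier pattern per bit."""
    signs = 2.0 * bits - 1.0  # map {0, 1} -> {-1, +1}
    return image + strength * np.tensordot(signs, patterns, axes=1)


def decode(image, patterns):
    """Recover each bit from the sign of its carrier's correlation with the image."""
    centered = image - image.mean()
    scores = np.tensordot(patterns, centered, axes=([1, 2], [0, 1]))
    return (scores > 0).astype(int)


def forward_diffuse(x0, alpha_bar, rng):
    """Standard forward-noising step; alpha_bar near 0 means almost pure noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise


def bit_accuracy_after_noise(alpha_bar, n_bits=64, size=64, strength=0.05, seed=0):
    """Fraction of payload bits decoded correctly after forward noising."""
    rng = np.random.default_rng(seed)
    cover = rng.uniform(-1.0, 1.0, (size, size))  # assumed stand-in for a real image
    bits = rng.integers(0, 2, n_bits)
    patterns = rng.choice([-1.0, 1.0], size=(n_bits, size, size))
    watermarked = embed(cover, bits, patterns, strength)
    noised = forward_diffuse(watermarked, alpha_bar, rng)
    return float(np.mean(decode(noised, patterns) == bits))


if __name__ == "__main__":
    for alpha_bar in (1.0, 0.5, 0.1, 0.01, 0.001):
        acc = bit_accuracy_after_noise(alpha_bar)
        print(f"alpha_bar={alpha_bar:<6} bit accuracy={acc:.2f}")
```

With alpha_bar = 1 no noise is added and the toy decoder should recover the payload essentially perfectly. As alpha_bar approaches 0 the accuracy should fall toward chance (about 0.5), the toy analogue of the decoding failure described above. Learned watermarks and real diffusion editors are far more complex, so this shows only the direction of the effect, not its size.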