Invisible watermarks safeguard images' copyrights by embedding hidden messages detectable by owners. They also help prevent misuse of images, especially those generated by AI models. Malicious adversaries can violate these rights by removing the watermarks. To remove a watermark without damaging visual quality, the adversary must erase it while retaining the essential information in the image. This is analogous to the encoding and decoding process of generative autoencoders, especially variational autoencoders (VAEs) and diffusion models. We propose a framework that uses generative autoencoders to remove invisible watermarks, and we test it with VAEs and diffusion models. Our results reveal that, even without specific training, off-the-shelf Stable Diffusion effectively removes most watermarks, outperforming all existing attacks. This result underscores the vulnerabilities in existing watermarking schemes and calls for more robust methods for copyright protection.
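The encode-then-decode idea described in the abstract can be illustrated with a deliberately tiny sketch. This is not the proposed framework: a block-averaging bottleneck is swapped in for a VAE or diffusion model, the watermark is a made-up checkerboard scheme chosen so that the bottleneck cannot represent it, and every name and constant (`embed`, `extract`, `encode`, `decode`, `STRENGTH`, the alternating payload) is a hypothetical chosen for illustration. It only shows why a lossy reconstruction that keeps coarse image content can discard a low-amplitude hidden signal.

```python
"""Toy regeneration attack: a lossy encode/decode pass erases a fragile watermark."""

BLOCK = 2      # block side used by the toy autoencoder (extract() assumes 2)
SIZE = 16      # the image is SIZE x SIZE pixels
STRENGTH = 2   # watermark amplitude in grey levels (hypothetical value)


def make_image():
    # Smooth synthetic gradient standing in for a natural image.
    return [[64 + 4 * x + 2 * y for x in range(SIZE)] for y in range(SIZE)]


def make_message():
    # One bit per block, alternating 0/1 (hypothetical payload).
    return [i % 2 for i in range((SIZE // BLOCK) ** 2)]


def embed(image, bits):
    # Add a +/- checkerboard inside each block; its sign carries the bit.
    out = [row[:] for row in image]
    per_row = SIZE // BLOCK
    for y in range(SIZE):
        for x in range(SIZE):
            bit = bits[(y // BLOCK) * per_row + (x // BLOCK)]
            sign = 1 if bit else -1
            checker = 1 if (x + y) % 2 == 0 else -1
            out[y][x] += STRENGTH * sign * checker
    return out


def extract(image):
    # Correlate each 2x2 block with the checkerboard; positive decodes as 1.
    bits = []
    for by in range(0, SIZE, BLOCK):
        for bx in range(0, SIZE, BLOCK):
            response = (image[by][bx] - image[by][bx + 1]
                        - image[by + 1][bx] + image[by + 1][bx + 1])
            bits.append(1 if response > 0 else 0)
    return bits


def encode(image):
    # Toy "encoder": keep only the mean of each block (a 4x smaller latent).
    return [[sum(image[by + dy][bx + dx]
                 for dy in range(BLOCK) for dx in range(BLOCK)) / BLOCK ** 2
             for bx in range(0, SIZE, BLOCK)]
            for by in range(0, SIZE, BLOCK)]


def decode(latent):
    # Toy "decoder": rebuild a full-size image by repeating each latent value.
    return [[latent[y // BLOCK][x // BLOCK] for x in range(SIZE)]
            for y in range(SIZE)]


def bit_accuracy(decoded, expected):
    return sum(d == e for d, e in zip(decoded, expected)) / len(expected)


def mean_abs_error(a, b):
    return sum(abs(p - q) for ra, rb in zip(a, b)
               for p, q in zip(ra, rb)) / SIZE ** 2


clean = make_image()
message = make_message()
marked = embed(clean, message)
attacked = decode(encode(marked))  # the "regeneration" step

acc_before = bit_accuracy(extract(marked), message)
acc_after = bit_accuracy(extract(attacked), message)
distortion = mean_abs_error(attacked, clean)
print(acc_before, acc_after, distortion)
```

In this toy setting the message is fully recovered before the attack and decodes at chance level afterwards, while the reconstruction stays within a few grey levels of the clean image. A real attack would replace `encode` and `decode` with a pretrained generative model, and real watermarking schemes are not restricted to a pattern that the bottleneck happens to average away.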