Watermarking plays a key role in the provenance and detection of AI-generated content. While existing methods prioritize robustness against real-world distortions (e.g., JPEG compression and noise addition), we reveal a fundamental tradeoff: such robust watermarks inherently increase the redundancy of the detectable patterns encoded into images, creating exploitable information leakage. To exploit this leakage, we propose an attack framework that extracts the leaked watermark pattern through multi-channel feature learning with a pre-trained vision model. Unlike prior works that require massive data or access to the detector, our method achieves both forgery and detection evasion from a single watermarked image. Extensive experiments demonstrate that our method achieves a 60\% gain in detection-evasion success rate and a 51\% improvement in forgery accuracy over state-of-the-art methods, while maintaining visual fidelity. Our work exposes a robustness-stealthiness paradox: current "robust" watermarks sacrifice security for distortion resistance, providing insights for future watermark design.
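The single-image attack described above can be loosely illustrated as follows. This is a minimal sketch, not the paper's actual method: it replaces the pre-trained vision model with a simple per-channel high-pass filter to estimate a redundant watermark residual, and the function names (`estimate_watermark_residual`, `forge`) are hypothetical. It only conveys the idea that a robust, redundantly embedded pattern can be approximately extracted from one watermarked image and transplanted onto a clean cover image.

```python
import numpy as np

def estimate_watermark_residual(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Crude per-channel estimate of a redundant watermark pattern.

    Stands in for the paper's multi-channel feature learning: subtract a
    box-blurred copy of each channel to keep only high-frequency content,
    where a robust watermark's redundant pattern would leak.
    """
    residual = np.empty(img.shape, dtype=np.float64)
    pad = k // 2
    for c in range(img.shape[2]):
        ch = img[:, :, c].astype(np.float64)
        padded = np.pad(ch, pad, mode="edge")
        # box blur via sliding windows, then subtract to isolate high frequencies
        windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
        blurred = windows.mean(axis=(-2, -1))
        residual[:, :, c] = ch - blurred
    return residual

def forge(cover: np.ndarray, residual: np.ndarray) -> np.ndarray:
    """Transplant the estimated pattern onto an unwatermarked cover image."""
    return np.clip(cover.astype(np.float64) + residual, 0, 255)
```

In the same spirit, subtracting (rather than adding) the estimated residual from the watermarked image itself would correspond to the detection-evasion direction of the attack.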