With the rise of AI-generated audio, watermarking has become widely used for detecting misuse and protecting intellectual property. However, adversaries may try to remove these watermarks, making it critical to evaluate how well watermarking schemes withstand removal attacks. Existing attacks are often impractical: they either noticeably degrade perceptual quality or require access to the watermarking scheme. We propose DiffErase, a black-box watermark removal attack that assumes no knowledge of the target watermarking scheme while maintaining perceptual quality. DiffErase perturbs watermarked audio to an intermediate diffusion noise level and regenerates it using a pretrained denoising model, effectively suppressing watermark signals. Theoretical analysis and extensive experiments demonstrate that inaudible audio watermarks are highly vulnerable: across multiple audio domains, DiffErase consistently removes watermarks while preserving perceptual quality. These findings highlight the need for future audio watermarking designs to consider diffusion-based threats. Code and demos are available at https://differase.github.io/DiffErase/.
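The noise-then-regenerate procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the noise schedule, the sampler, or the pretrained model, so the sketch assumes a DDPM-style linear beta schedule and a hypothetical `denoise_step` callable that stands in for one reverse step of a pretrained denoiser. The names `linear_alpha_bars`, `noise_to_level`, `regenerate`, and `t_star` are likewise illustrative.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical interface: one reverse-diffusion step of a pretrained denoiser.
# It takes the current noisy signal and the step index t, and returns the
# signal at step t - 1. The real model and sampler are not given in the source.
DenoiseStep = Callable[[List[float], int], List[float]]


def linear_alpha_bars(num_steps: int,
                      beta_start: float = 1e-4,
                      beta_end: float = 0.02) -> List[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule and its endpoints are assumptions (common DDPM defaults).
    """
    alpha_bars = []
    prod = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def noise_to_level(x0: Sequence[float], alpha_bar: float,
                   rng: random.Random) -> List[float]:
    """Forward-diffuse a clean signal: sqrt(a) * x0 + sqrt(1 - a) * noise."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def regenerate(watermarked: Sequence[float], denoise_step: DenoiseStep,
               alpha_bars: Sequence[float], t_star: int,
               rng: random.Random) -> List[float]:
    """Perturb audio to intermediate step t_star, then denoise back to step 0.

    No watermark detector or key is used anywhere, which is what makes the
    procedure black-box with respect to the watermarking scheme.
    """
    x = noise_to_level(watermarked, alpha_bars[t_star], rng)
    for t in range(t_star, -1, -1):
        x = denoise_step(x, t)
    return x
```

Plain Python lists keep the sketch dependency-free; a practical version would operate on tensors of waveform samples or spectrogram frames. The choice of `t_star` is the central trade-off in any such scheme: a larger value injects more noise, which suppresses the watermark more strongly but leaves the denoiser less of the original signal to reconstruct.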