Deep image watermarking embeds watermarks imperceptibly in cover images and extracts them reliably, and it has proven effective for copyright protection of image assets. However, existing methods struggle to satisfy three essential criteria for generalizable watermarking at the same time: (1) invisibility (imperceptible hiding of watermarks), (2) robustness (reliable watermark recovery under diverse conditions), and (3) broad applicability (low latency in the watermarking process). To address these limitations, we propose Hierarchical Watermark Learning (HiWL), a two-stage optimization framework that enables a watermarking model to meet all three criteria simultaneously. In the first stage, distribution alignment learning establishes a common latent space under two constraints: (1) visual consistency between watermarked and non-watermarked images, and (2) information invariance across watermark latent representations. In this way, multimodal inputs, namely watermark messages (binary codes) and cover images (RGB pixels), can be effectively represented, ensuring both the invisibility of watermarks and the robustness of the watermarking process. In the second stage, generalized watermark representation learning separates a unique representation of the watermark from the marked image in RGB space. Once trained, the HiWL model learns generalizable watermark representations while maintaining broad applicability. Extensive experiments demonstrate the effectiveness of the proposed method: it achieves 7.6% higher watermark extraction accuracy than existing methods while maintaining extremely low latency (processing 1,000 images in 1 second).
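To make the three criteria concrete, the following minimal sketch shows how each might be quantified. The abstract does not state which metrics HiWL uses, so everything here is an assumption: bit accuracy stands in for robustness, PSNR for invisibility, and per-image time for latency. The function names `bit_accuracy`, `psnr`, and `per_image_latency_ms` are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def bit_accuracy(embedded: Sequence[int], extracted: Sequence[int]) -> float:
    """Fraction of message bits recovered correctly (assumed robustness proxy)."""
    if len(embedded) != len(extracted) or not embedded:
        raise ValueError("messages must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(embedded, extracted) if a == b)
    return matches / len(embedded)


def psnr(cover: Sequence[float], marked: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB over flattened pixels (assumed invisibility proxy)."""
    if len(cover) != len(marked) or not cover:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((c - m) ** 2 for c, m in zip(cover, marked)) / len(cover)
    if mse == 0:
        return math.inf  # identical images: no measurable distortion
    return 10.0 * math.log10(peak ** 2 / mse)


def per_image_latency_ms(num_images: int, seconds: float) -> float:
    """Average processing time per image in milliseconds."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    return 1000.0 * seconds / num_images


# Toy values for illustration only; these are not results from the paper.
message = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = [1, 0, 1, 0, 0, 0, 1, 0]           # one of eight bits flipped
cover_pixels = [100.0, 150.0, 200.0, 250.0]
marked_pixels = [101.0, 149.0, 201.0, 249.0]   # each pixel perturbed by 1

print(bit_accuracy(message, recovered))        # 0.875
print(round(psnr(cover_pixels, marked_pixels), 2))
print(per_image_latency_ms(1000, 1.0))         # 1.0
```

Under these assumptions, the reported throughput of 1,000 images in 1 second corresponds to an average of 1 ms per image.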