Deep learning-based image watermarking, while robust against conventional distortions, remains vulnerable to advanced adversarial and regeneration attacks. Conventional countermeasures, which jointly optimize the encoder and decoder via a noise layer, face two inevitable challenges: (1) a decrease in clean accuracy due to adversarial training of the decoder and (2) limited robustness due to training against all three advanced attacks simultaneously. To overcome these issues, we propose AdvMark, a novel two-stage fine-tuning framework that decouples the defense strategies. In stage 1, we address adversarial vulnerability via a tailored adversarial training paradigm that primarily fine-tunes the encoder while only conditionally updating the decoder. This approach learns to move the image into a non-attackable region rather than modifying the decision boundary, thus preserving clean accuracy. In stage 2, we tackle distortion and regeneration attacks via direct image optimization. To preserve the adversarial robustness gained in stage 1, we formulate a principled, constrained image loss with theoretical guarantees, which balances the deviation from the cover image against the deviation from the previously encoded image. We also propose a quality-aware early-stop strategy to further guarantee a lower bound on visual quality. Extensive experiments demonstrate that AdvMark outperforms existing methods, achieving the highest image quality and comprehensive robustness, with accuracy improvements of up to 29%, 33%, and 46% against distortion, regeneration, and adversarial attacks, respectively.
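
The stage 2 idea described above (direct image optimization under a loss tied to both the cover image and the previously encoded image, with a quality-aware early stop) can be illustrated with a toy sketch. This is not AdvMark's actual formulation, which the abstract does not spell out. The names `image_loss`, `optimize`, `alpha`, `min_psnr`, and `robustness_grad` are hypothetical, and the use of MSE, PSNR, a fixed weighting, and plain gradient descent are all assumptions made for illustration.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

def image_loss(x, cover, prev_encoded, alpha=0.5):
    """Hypothetical penalty balancing deviation from the cover image (weight alpha)
    against deviation from the previously encoded image (weight 1 - alpha)."""
    return alpha * mse(x, cover) + (1.0 - alpha) * mse(x, prev_encoded)

def optimize(cover, prev_encoded, robustness_grad,
             steps=100, lr=0.01, alpha=0.5, min_psnr=35.0):
    """Gradient descent on pixels, starting from the previously encoded image.

    robustness_grad(x) is an assumed callback returning the gradient of some
    robustness objective (e.g. decoding loss under attack) with respect to x.
    The loop stops before accepting any image whose PSNR against the cover
    falls below min_psnr, so the returned image meets the floor whenever the
    starting image does.
    """
    x = list(prev_encoded)
    n = len(x)
    for _ in range(steps):
        g_rob = robustness_grad(x)
        candidate = []
        for i in range(n):
            # Analytic gradient of image_loss with respect to pixel i.
            g_img = 2.0 * (alpha * (x[i] - cover[i])
                           + (1.0 - alpha) * (x[i] - prev_encoded[i])) / n
            step = x[i] - lr * (g_rob[i] + g_img)
            candidate.append(min(1.0, max(0.0, step)))  # keep pixels in [0, 1]
        if psnr(candidate, cover) < min_psnr:
            break  # quality-aware early stop: keep the last acceptable image
        x = candidate
    return x
```

As a usage example, calling `optimize(cover, prev_encoded, lambda x: [-1.0] * len(x))` with a constant toy gradient nudges every pixel upward until the next step would push PSNR against the cover below `min_psnr`, at which point the last acceptable image is returned.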