Dual Defense: Adversarial, Traceable, and Invisible Robust Watermarking against Face Swapping

Malicious applications of deep forgery, typified by face swapping, pose security threats such as the spread of misinformation and identity fraud. Some prior work uses robust watermarking to trace the copyright of facial images after the fact, but these methods can neither prevent forgeries from being generated at the source nor curb their dissemination. To address this problem, we propose Dual Defense, a comprehensive active defense mechanism that combines traceability and adversariality. Dual Defense invisibly embeds a single robust watermark in the target face so that it can actively respond to unforeseen cases of malicious face swapping. The watermark disrupts the output of the face swapping model while remaining intact throughout the entire dissemination process, so it can be extracted at any stage of image tracking for traceability. Specifically, we introduce a watermark embedding network based on an original-domain feature impersonation attack. This network learns robust adversarial features of the target facial image and embeds the watermark, using perceptual adversarial encoding strategies to balance watermark invisibility, adversariality, and traceability. Extensive experiments demonstrate that Dual Defense achieves the best overall defense success rates and generalizes well across face swapping tasks and datasets. It maintains strong adversariality and traceability in both original and robust settings, surpassing current forgery defense methods that offer only one of these capabilities, including CMUA-Watermark, Anti-Forgery, FakeTagger, and PGD.
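The three-way trade-off between invisibility, adversariality, and traceability can be pictured as a weighted sum of three loss terms. The following is only a minimal sketch under stated assumptions, not the objective used by Dual Defense: the function names (`mse`, `bce`, `dual_objective`), the weights, and the choice of each term are hypothetical. In particular, the adversarial term here simply rewards distortion of the face-swap output and stands in for the feature impersonation attack, whose details the abstract does not give.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

def bce(bits, probs, eps=1e-7):
    """Binary cross-entropy between embedded bits and decoded probabilities."""
    total = 0.0
    for b, p in zip(bits, probs):
        p = min(max(p, eps), 1 - eps)  # clamp to keep log() finite
        total -= b * math.log(p) + (1 - b) * math.log(1 - p)
    return total / len(bits)

def dual_objective(x, x_w, swap_clean, swap_w, bits, probs,
                   w_inv=1.0, w_adv=0.1, w_trace=1.0):
    """Hypothetical three-term objective (weights are assumptions).

    x, x_w         : flattened original and watermarked images
    swap_clean     : face-swap output for the original image
    swap_w         : face-swap output for the watermarked image
    bits, probs    : embedded message bits and decoder bit probabilities
    """
    # Invisibility: the watermarked image should stay close to the original.
    l_inv = mse(x_w, x)
    # Adversariality (assumed stand-in): reward a large gap between the
    # swap outputs, so this term is the negated distortion.
    l_adv = -mse(swap_w, swap_clean)
    # Traceability: the decoded bits should match the embedded message.
    l_trace = bce(bits, probs)
    total = w_inv * l_inv + w_adv * l_adv + w_trace * l_trace
    return total, (l_inv, l_adv, l_trace)

# Toy inputs: a small perturbation, a fully disrupted swap, an unsure decoder.
total, (l_inv, l_adv, l_trace) = dual_objective(
    x=[0.0] * 4, x_w=[0.1] * 4,
    swap_clean=[0.0] * 4, swap_w=[1.0] * 4,
    bits=[1, 0], probs=[0.5, 0.5],
)
```

Raising `w_adv` in such a scheme would favor stronger disruption of the swap output at the expense of the other two terms, which is the kind of balance the encoding strategy has to strike.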
