SAiW: Source-Attributable Invisible Watermarking for Proactive Deepfake Defense

Deepfakes produced by modern generative models pose a serious threat to information integrity, digital identity, and public trust. Existing detection methods are largely reactive: they attempt to identify manipulations after they occur and often fail to generalize across evolving generation techniques. This motivates proactive mechanisms that secure media authenticity at the time of creation. In this work, we introduce SAiW, a Source-Attributable Invisible Watermarking framework for proactive deepfake defense and media provenance verification. Unlike conventional watermarking methods that treat watermark payloads as generic signals, SAiW formulates watermark embedding as a source-conditioned representation learning problem, in which the watermark identity encodes the originating source and modulates the embedding process to produce discriminative and traceable signatures. The framework uses feature-wise linear modulation to inject source identity into the embedding network, enabling scalable multi-source watermark generation. A perceptual guidance module derived from human visual system priors keeps watermark perturbations visually imperceptible while maintaining robustness. In addition, a dual-purpose forensic decoder simultaneously reconstructs the embedded watermark and performs source attribution, providing both automated verification and interpretable forensic evidence. Extensive experiments across multiple deepfake datasets demonstrate that SAiW achieves high perceptual quality while maintaining strong robustness against compression, filtering, noise, geometric transformations, and adversarial perturbations. By binding digital media to its origin through invisible yet verifiable markers, SAiW enables reliable authentication and source attribution, providing a scalable foundation for proactive deepfake defense and trustworthy media provenance.
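To make the source-conditioning idea concrete, the following is a minimal sketch of feature-wise linear modulation, in which a source identity selects a per-channel scale and shift that are applied to an intermediate feature map. It is an illustration under stated assumptions, not the SAiW implementation: the function name `film_modulate`, the lookup tables `gamma_table` and `beta_table` (standing in for whatever learned conditioning network the framework uses), and all tensor shapes are hypothetical.

```python
import numpy as np

def film_modulate(features, source_id, gamma_table, beta_table):
    """Feature-wise linear modulation: y = gamma(s) * x + beta(s).

    features:    (C, H, W) feature map from a hypothetical embedding network
    source_id:   integer index of the originating source
    gamma_table: (num_sources, C) per-source channel scales (assumed, random here)
    beta_table:  (num_sources, C) per-source channel shifts (assumed, random here)
    """
    # Reshape the per-channel parameters to (C, 1, 1) so they broadcast over H and W.
    gamma = gamma_table[source_id][:, None, None]
    beta = beta_table[source_id][:, None, None]
    return gamma * features + beta

rng = np.random.default_rng(0)
num_sources, channels = 4, 8

# Stand-ins for learned parameters; a real model would train these.
gamma_table = 1.0 + 0.1 * rng.standard_normal((num_sources, channels))
beta_table = 0.1 * rng.standard_normal((num_sources, channels))

features = rng.standard_normal((channels, 16, 16))

# The same features modulated by two different source identities.
out_a = film_modulate(features, 0, gamma_table, beta_table)
out_b = film_modulate(features, 1, gamma_table, beta_table)
```

Because only the scale and shift depend on the source, supporting an additional source in this sketch means adding one row to each table rather than changing the network, which is one plausible reading of the scalability claim above.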
