The proliferation of generative image models has revolutionized AIGC creation while amplifying concerns over content provenance and manipulation forensics. Existing methods are typically either unable to localize tampering or restricted to specific generative settings, limiting their practical utility. We propose \textbf{GenPTW}, a \textbf{Gen}eral watermarking framework that unifies \textbf{P}rovenance tracing and \textbf{T}amper localization in latent space. It supports both in-generation and post-generation embedding without altering the generative process, and is plug-and-play compatible with latent diffusion models (LDMs) and visual autoregressive (VAR) models. To achieve precise provenance tracing and tamper localization, we embed the watermark using two complementary mechanisms: cross-attention fusion aligned with latent semantics, and spatial fusion providing explicit spatial guidance for edit sensitivity. A tamper-aware extractor jointly conducts provenance tracing and tamper localization by leveraging watermark features together with high-frequency features. Experiments show that GenPTW maintains high visual fidelity and strong robustness against diverse AIGC editing operations.
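The two embedding mechanisms and the high-frequency cue used by the extractor can be illustrated with a minimal sketch. This is not the paper's implementation: all dimensions, function names (`cross_attention_fuse`, `spatial_fuse`, `high_frequency`), and the identity projection matrices are illustrative assumptions; it only shows the shape of the computation, with single-head attention over flattened latent tokens and a Laplacian high-pass standing in for the learned high-frequency branch.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(latent, wm_tokens, Wq, Wk, Wv):
    """Semantic-aligned fusion: latent tokens attend to watermark tokens."""
    q = latent @ Wq                                           # (N, d)
    k = wm_tokens @ Wk                                        # (M, d)
    v = wm_tokens @ Wv                                        # (M, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)   # (N, M)
    return latent + attn @ v                                  # residual fusion

def spatial_fuse(latent_map, wm_map, alpha=0.1):
    """Explicit spatial guidance: additively blend a spatial watermark map."""
    return latent_map + alpha * wm_map

def high_frequency(latent_map):
    """Laplacian high-pass as a stand-in high-frequency feature (interior only)."""
    lap = np.zeros_like(latent_map)
    lap[1:-1, 1:-1] = (4 * latent_map[1:-1, 1:-1]
                       - latent_map[:-2, 1:-1] - latent_map[2:, 1:-1]
                       - latent_map[1:-1, :-2] - latent_map[1:-1, 2:])
    return lap

# --- toy demo (all sizes are hypothetical, not from the paper) ---
d, n_tokens, n_bits = 16, 64, 32
latent = rng.normal(size=(n_tokens, d))        # flattened 8x8 latent, d channels
bits = rng.integers(0, 2, n_bits)              # watermark message
embed = 0.1 * rng.normal(size=(n_bits, d))
wm_tokens = bits[:, None] * embed              # bit-conditioned watermark tokens
Wq = Wk = Wv = np.eye(d)                       # untrained projections for illustration
fused = cross_attention_fuse(latent, wm_tokens, Wq, Wk, Wv)

patch = fused[:, 0].reshape(8, 8)              # one channel as a spatial map
wm_map = rng.normal(size=(8, 8))
guided = spatial_fuse(patch, wm_map)
```

In a trained system the projections, watermark embeddings, and blend strength would be learned jointly with the extractor; here they are fixed only so the tensor shapes and data flow are concrete.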