Cryptographic provenance standards such as C2PA, and invisible watermarking schemes, are positioned as complementary defenses for content authentication, yet the two verification layers are technically independent: neither conditions on the output of the other. This work formalizes and empirically demonstrates the $\textit{Integrity Clash}$, a condition in which a digital asset carries a cryptographically valid C2PA manifest asserting human authorship while its pixels carry a watermark identifying it as AI-generated, with each signal passing its own verification check in isolation. We construct metadata-washing workflows that produce these authenticated fakes through standard editing pipelines. They require no cryptographic compromise, only the semantic omission of a single assertion field, an omission the current C2PA specification permits. To close this gap, we propose a cross-layer audit protocol that jointly evaluates provenance metadata and watermark detection status, achieving 100% classification accuracy across 3,500 test images spanning four conflict-matrix states and three realistic perturbation conditions. Our results demonstrate that the gap between these verification layers is unnecessary and technically straightforward to close.
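The joint evaluation described above can be illustrated with a minimal sketch. It assumes the two layers have already been checked separately and reduced to two booleans: whether the validated manifest declares AI generation, and whether the watermark detector fired. The names `AuditState`, `LayerResults`, and `cross_layer_audit`, and the labels for the four states, are hypothetical and chosen for illustration only. The abstract does not name the states or specify the protocol's interface.

```python
from dataclasses import dataclass
from enum import Enum


class AuditState(Enum):
    # Hypothetical labels for the four manifest/watermark combinations.
    CONSISTENT_HUMAN = "human claim, no AI watermark"
    CONSISTENT_AI = "AI declared, AI watermark present"
    INTEGRITY_CLASH = "human claim, AI watermark present"
    AI_DECLARED_NO_WATERMARK = "AI declared, no AI watermark"


@dataclass(frozen=True)
class LayerResults:
    # Assumed inputs: each layer was verified on its own upstream.
    manifest_declares_ai: bool  # validated manifest asserts AI generation
    watermark_detected: bool    # pixel-level AI watermark detector fired


def cross_layer_audit(r: LayerResults) -> AuditState:
    """Map the two independent layer outputs to one joint verdict."""
    if r.watermark_detected:
        if r.manifest_declares_ai:
            return AuditState.CONSISTENT_AI
        # Valid manifest claims human authorship, pixels say AI-generated.
        return AuditState.INTEGRITY_CLASH
    if r.manifest_declares_ai:
        return AuditState.AI_DECLARED_NO_WATERMARK
    return AuditState.CONSISTENT_HUMAN
```

For example, `cross_layer_audit(LayerResults(manifest_declares_ai=False, watermark_detected=True))` returns `AuditState.INTEGRITY_CLASH`, the case that neither layer flags when it is checked in isolation.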