Detecting deepfakes has become increasingly challenging as forged faces synthesized by AI generation methods, particularly diffusion models, reach unprecedented quality and resolution. Existing forgery detection approaches that rely on spatial and frequency features show limited efficacy against high-quality, entirely synthesized forgeries. In this paper, we propose a novel detection method grounded in the observation that facial attributes governed by complex physical laws and multiple parameters are inherently difficult to replicate. Specifically, we focus on illumination, and in particular on the specular reflection component of the Phong illumination model, which is the hardest component to replicate because of its parametric complexity and nonlinear formulation. We introduce a fast and accurate face texture estimation method based on Retinex theory to enable precise specular reflection separation. Furthermore, drawing on the mathematical formulation of specular reflection, we posit that forgery evidence manifests not only in the specular reflection itself but also in its relationship with the corresponding face texture and direct light. To exploit this, we design the Specular-Reflection-Inconsistency-Network (SRI-Net), which uses a two-stage cross-attention mechanism to capture these correlations and integrates specular-reflection-related features with image features for robust forgery detection. Experimental results demonstrate that our method achieves superior performance on both traditional and generative deepfake datasets, particularly those containing diffusion-generated forged faces.
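To illustrate why the specular component is described as parametrically complex and nonlinear, the following minimal sketch evaluates the textbook Phong specular term, k_s · I · max(R·V, 0)^α, at a single surface point. It is not the separation method or SRI-Net proposed in this paper; the function names, the default values of `k_s` and `shininess`, and the use of a single point light are assumptions made purely for illustration.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def phong_specular(normal, light_dir, view_dir,
                   k_s=0.5, shininess=32.0, light_intensity=1.0):
    """Textbook Phong specular term at one surface point.

    normal, light_dir, and view_dir point away from the surface.
    k_s, shininess, and light_intensity are illustrative defaults,
    not values taken from the paper.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)

    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        # The light is behind the surface, so there is no highlight.
        return 0.0

    # Reflect the light direction about the normal: R = 2(N.L)N - L.
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))

    # The exponent makes the response strongly nonlinear in the
    # angle between the reflection and view directions.
    r_dot_v = max(dot(r, v), 0.0)
    return k_s * light_intensity * r_dot_v ** shininess


on_axis = phong_specular((0, 0, 1), (0, 0, 1), (0, 0, 1))
off_axis = phong_specular((0, 0, 1), (0, 0, 1), (1, 0, 1))
back_lit = phong_specular((0, 0, 1), (0, 0, -1), (0, 0, 1))
```

Even in this simplified form, the highlight depends jointly on the surface normal, the light direction, the view direction, the specular coefficient, and the shininess exponent. Tilting the view direction by 45 degrees reduces the term by several orders of magnitude, which gives a sense of how tightly a physically consistent highlight is tied to the underlying geometry and lighting.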