In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection under Rebalanced Deepfake Detection Protocol

As deep generative models advance, we anticipate that deepfakes will achieve "perfection," generating no discernible artifacts or noise. However, current deepfake detectors, intentionally or inadvertently, rely on such artifacts for detection, as these artifacts are exclusive to deepfakes and absent from genuine examples. To bridge this gap, we introduce the Rebalanced Deepfake Detection Protocol (RDDP) to stress-test detectors under balanced scenarios where genuine and forged examples bear similar artifacts. We offer two RDDP variants: RDDP-WHITEHAT uses white-hat deepfake algorithms to create "self-deepfakes," genuine portrait videos that retain the underlying identity yet carry artifacts similar to those of deepfake videos; RDDP-SURROGATE employs surrogate functions (e.g., Gaussian noise) to process both genuine and forged examples, introducing equivalent noise and thereby sidestepping the need for deepfake algorithms. To detect perfect deepfake videos that closely match genuine ones, we present ID-Miner, a detector that identifies the puppeteer behind the disguise by focusing on motion rather than artifacts or appearance. As an identity-based detector, it authenticates videos by comparing them with reference footage. Equipped with an artifact-agnostic loss at the frame level and an identity-anchored loss at the video level, ID-Miner effectively singles out identity signals amidst distracting variations. Extensive experiments comparing ID-Miner with 12 baseline detectors under both conventional and RDDP evaluations on two deepfake datasets, along with additional qualitative studies, affirm the superiority of our method and the necessity of detectors designed to counter perfect deepfakes.
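
As a minimal sketch of the RDDP-SURROGATE idea, the snippet below passes genuine and forged videos through the same Gaussian-noise surrogate so that both classes carry comparable noise. The names `gaussian_surrogate` and `rebalance`, the flattened-list video representation, and the value `sigma=0.05` are illustrative assumptions, not details taken from the paper.

```python
import random
from typing import List, Tuple

Frame = List[float]   # assumption: a frame is a flat list of intensities in [0, 1]
Video = List[Frame]   # assumption: a video is a list of such frames


def gaussian_surrogate(video: Video, sigma: float, rng: random.Random) -> Video:
    """Add zero-mean Gaussian noise to every pixel and clip back to [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in frame]
        for frame in video
    ]


def rebalance(genuine: Video, forged: Video,
              sigma: float = 0.05, seed: int = 0) -> Tuple[Video, Video]:
    """Apply the same surrogate, with the same noise level, to both classes.

    After this step, the presence of noise alone no longer separates
    genuine from forged examples.
    """
    rng = random.Random(seed)
    return (gaussian_surrogate(genuine, sigma, rng),
            gaussian_surrogate(forged, sigma, rng))
```

A detector would then be evaluated on the outputs of `rebalance` rather than on the raw videos; one that relies only on noise-like artifacts would be expected to lose accuracy under this setup.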
