We study universal deepfake detection. Our goal is to detect synthetic images produced by a range of generative AI approaches, particularly emerging ones that are unseen during training of the deepfake detector. Universal deepfake detection therefore requires strong generalization capability. Motivated by recently proposed masked image modeling, which has demonstrated excellent generalization in self-supervised pre-training, we make the first attempt to explore masked image modeling for universal deepfake detection. We study spatial-domain and frequency-domain masking in training deepfake detectors. Based on empirical analysis, we propose a novel deepfake detector via frequency masking. Our focus on the frequency domain differs from the majority of existing methods, which primarily target spatial-domain detection. Our comparative analyses reveal substantial performance gains over existing methods. Code and models are publicly available.
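To make the idea of frequency-domain masking concrete, the following is a minimal sketch of one way a training image could be masked in the frequency domain. It is an illustration only, not the method described above: the abstract does not specify the masking scheme, so the function name `frequency_mask`, the uniform random choice of bins, and the default `mask_ratio` are all assumptions.

```python
import numpy as np


def frequency_mask(image, mask_ratio=0.15, rng=None):
    """Zero out a random fraction of an image's frequency coefficients.

    Hypothetical helper for illustration; the masking scheme and the
    default ratio are assumptions, not taken from the paper.

    image: float array of shape (H, W) or (H, W, C).
    mask_ratio: probability that each frequency bin is set to zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(image, dtype=np.float64)

    # 2-D FFT over the spatial axes; channels are transformed independently.
    spectrum = np.fft.fft2(img, axes=(0, 1))

    # One random binary mask over (H, W), shared across channels.
    keep = rng.random(img.shape[:2]) >= mask_ratio
    if img.ndim == 3:
        keep = keep[..., None]

    # A random mask breaks the conjugate symmetry of the spectrum,
    # so the inverse transform is complex; keep only the real part.
    return np.fft.ifft2(spectrum * keep, axes=(0, 1)).real


# Usage: mask a random 32x32 RGB array before feeding it to a detector.
example = np.random.default_rng(0).random((32, 32, 3))
masked_example = frequency_mask(example, mask_ratio=0.15)
```

With `mask_ratio=0.0` the image passes through unchanged, and with `mask_ratio=1.0` every coefficient is removed. A spatial-domain counterpart would apply the same kind of binary mask directly to pixels or patches instead of to the spectrum.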