We study universal deepfake detection. Our goal is to detect synthetic images produced by a range of generative AI approaches, particularly emerging ones that are unseen during training of the deepfake detector. Universal deepfake detection therefore requires outstanding generalization capability. Motivated by masked image modeling, a recently proposed technique that has demonstrated excellent generalization in self-supervised pre-training, we make the first attempt to explore masked image modeling for universal deepfake detection. We study both spatial-domain and frequency-domain masking in training deepfake detectors. Based on empirical analysis, we propose a novel deepfake detector via frequency masking. Our focus on the frequency domain differs from most existing work, which primarily targets spatial-domain detection. Our comparative analyses reveal substantial performance gains over existing methods. Code and models are publicly available.
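To make the idea of frequency masking concrete, the following is a minimal sketch of one way an image could be masked in the frequency domain before being passed to a detector during training. It is an illustration under stated assumptions, not the method proposed here: the function name `frequency_mask`, the `mask_ratio` parameter, and the choice to mask coefficients uniformly at random are all hypothetical, since the abstract does not specify which frequencies are masked or at what ratio.

```python
import numpy as np


def frequency_mask(image, mask_ratio=0.15, rng=None):
    """Zero out a random fraction of an image's 2-D Fourier coefficients.

    Hypothetical helper for illustration only. `image` is an array of
    shape (H, W) or (H, W, C); `mask_ratio` is the assumed fraction of
    frequency coefficients to remove.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float64)

    # Transform to the frequency domain over the two spatial axes.
    spectrum = np.fft.fft2(image, axes=(0, 1))

    # Boolean mask: True keeps a coefficient, False zeroes it.
    h, w = image.shape[:2]
    keep = rng.random((h, w)) >= mask_ratio
    if image.ndim == 3:
        keep = keep[:, :, None]  # share one mask across channels

    # Back to the spatial domain. The random mask is not conjugate
    # symmetric, so the inverse transform is complex in general; taking
    # the real part is a simplification made for this sketch.
    return np.real(np.fft.ifft2(spectrum * keep, axes=(0, 1)))
```

In a training loop, such a function would typically be applied as an augmentation to each input image, with the detector then trained on the masked images in the usual real-versus-synthetic classification setting.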