We study universal deepfake detection. Our goal is to detect synthetic images produced by a range of generative AI approaches, particularly emerging ones that are unseen during training of the deepfake detector. Universal deepfake detection therefore requires outstanding generalization capability. Motivated by recently proposed masked image modeling, which has demonstrated excellent generalization in self-supervised pre-training, we make the first attempt to explore masked image modeling for universal deepfake detection. We study spatial and frequency domain masking in training deepfake detectors. Based on empirical analysis, we propose a novel deepfake detector via frequency masking. Our focus on the frequency domain differs from the majority of existing methods, which primarily target spatial domain detection. Our comparative analyses reveal substantial performance gains over existing methods. Code and models are publicly available.
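To make the idea of frequency domain masking concrete, the following is a minimal sketch of how a training image could be masked in the Fourier domain before being passed to a detector. It is an illustration only, not the released implementation: the function name `frequency_mask`, the default `mask_ratio` of 0.15, and the choice to drop individual coefficients uniformly at random are all assumptions made here, and the actual method may select frequencies differently (for example by band).

```python
import numpy as np


def frequency_mask(image, mask_ratio=0.15, rng=None):
    """Zero out a random fraction of an image's 2D Fourier coefficients.

    Hypothetical helper for illustration. `image` is an (H, W) or (H, W, C)
    array; `mask_ratio` is the assumed probability of dropping each frequency.
    """
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(image, dtype=np.float64)

    # Transform the spatial axes only; channels are transformed independently.
    spectrum = np.fft.fft2(img, axes=(0, 1))

    # keep[i, j] is False where a frequency is masked. The same mask is
    # shared across channels (an assumption of this sketch).
    keep = rng.random(img.shape[:2]) >= mask_ratio
    if img.ndim == 3:
        keep = keep[..., None]

    # The random mask is not conjugate-symmetric, so the inverse transform is
    # complex in general; only the real part is kept as the masked image.
    return np.fft.ifft2(spectrum * keep, axes=(0, 1)).real


if __name__ == "__main__":
    demo = np.random.default_rng(0).random((32, 32, 3))
    masked = frequency_mask(demo, mask_ratio=0.15, rng=np.random.default_rng(1))
    print(masked.shape)
```

In a training loop, a function of this kind would typically be applied as a data augmentation to both real and synthetic images, with the detector trained on the masked outputs. A spatial counterpart would instead zero out pixels or patches directly in the image.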