The rise of deepfake images, especially of well-known personalities, poses a serious threat to the dissemination of authentic information. To tackle this, we present a thorough investigation into how deepfakes are produced and how they can be identified. The cornerstone of our research is a rich collection of artificial celebrity faces, titled DeepFakeFace (DFF). We crafted the DFF dataset using advanced diffusion models and have shared it with the community through online platforms. This data serves as a robust foundation for training and testing algorithms designed to spot deepfakes. We carried out a detailed review of the DFF dataset and propose two evaluation methods to gauge the robustness and adaptability of deepfake recognition tools. The first method tests whether an algorithm trained on one type of fake image can recognize those produced by other methods. The second evaluates the algorithm's performance on imperfect images, such as those that are blurry, of low quality, or compressed. Because results vary across deepfake generation methods and image perturbations, our findings stress the need for better deepfake detectors. Our DFF dataset and tests aim to boost the development of more effective tools against deepfakes.
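The two evaluation protocols described above can be sketched in a few lines. The following is a minimal illustration, not the DFF benchmark code: every name in it (`accuracy`, `cross_generator_matrix`, `box_blur`, `robustness_accuracy`) is hypothetical, images are represented as plain nested lists of grayscale values, and a simple box blur stands in for the blur, low-quality, and compression perturbations mentioned in the abstract.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Image = List[List[float]]          # grayscale image, rows of pixel values in [0, 1]
Sample = Tuple[Image, int]         # (image, label); label 1 = fake, 0 = real
Detector = Callable[[Image], int]  # maps an image to a predicted label


def accuracy(detector: Detector, samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("need at least one sample")
    correct = sum(1 for image, label in samples if detector(image) == label)
    return correct / len(samples)


def cross_generator_matrix(
    detectors: Dict[str, Detector],
    test_sets: Dict[str, Sequence[Sample]],
) -> Dict[str, Dict[str, float]]:
    """Protocol 1 (assumed shape): result[train_gen][test_gen] is the accuracy
    of the detector trained on train_gen when scored on test_gen's images."""
    return {
        train_gen: {
            test_gen: accuracy(detector, samples)
            for test_gen, samples in test_sets.items()
        }
        for train_gen, detector in detectors.items()
    }


def box_blur(image: Image, radius: int = 1) -> Image:
    """Mean filter over a (2*radius+1)^2 window, clipped at the image borders.
    A stand-in perturbation; the abstract does not specify the exact filters."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def robustness_accuracy(
    detector: Detector,
    samples: Sequence[Sample],
    perturb: Callable[[Image], Image],
) -> float:
    """Protocol 2 (assumed shape): accuracy after perturbing every test image."""
    return accuracy(detector, [(perturb(image), label) for image, label in samples])
```

In practice the detectors would be trained classifiers and the perturbations would come from an image library. Comparing `accuracy` on clean images against `robustness_accuracy` on perturbed ones gives the robustness gap for a detector, and the off-diagonal entries of `cross_generator_matrix` show how well it generalizes to unseen generation methods.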