Deepfake techniques generate highly realistic data, making it difficult for humans to distinguish real images from artificially generated ones. Deep learning-based deepfake detection methods, particularly those involving diffusion models, have recently made remarkable progress. However, real-world applications increasingly need to handle unseen individuals, unseen deepfake techniques, and unseen scenarios. To meet this need, we propose a Prototype-based Unified Framework for Deepfake Detection (PUDD). PUDD is a similarity-based detection system: it compares input data against known prototypes to classify videos, and it flags potential deepfakes or previously unseen classes by analyzing drops in similarity. Our extensive experiments reveal three key findings: (1) PUDD achieves an accuracy of 95.1% on Celeb-DF, outperforming state-of-the-art deepfake detection methods; (2) PUDD uses image classification as the upstream task during training and shows promising performance on both image classification and deepfake detection at inference time; (3) PUDD requires only 2.7 seconds to retrain on new data and emits $10^{5}$ times less carbon than the state-of-the-art model, making it significantly more environmentally friendly.
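The similarity-based idea can be illustrated with a minimal sketch. It is not the PUDD implementation: the abstract does not say how prototypes are built or how a similarity drop is measured. The sketch assumes that a prototype is the mean of a class's feature vectors, that similarity is cosine similarity, and that a drop is detected with a fixed threshold of 0.8. The names `build_prototypes`, `classify`, and `add_class` are hypothetical, and the feature vectors are assumed to come from some upstream encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_prototypes(features_by_class):
    """Assumption: one prototype per class, the mean of that class's feature vectors."""
    prototypes = {}
    for label, vectors in features_by_class.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def classify(feature, prototypes, threshold=0.8):
    """Return (label, similarity) for the closest prototype.

    If even the best similarity falls below the (assumed) threshold, the
    label is None, meaning "potential deepfake or previously unseen class".
    """
    best_label, best_sim = max(
        ((label, cosine(feature, proto)) for label, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    if best_sim < threshold:
        return None, best_sim
    return best_label, best_sim


def add_class(prototypes, label, vectors):
    """Register a new class by computing only its prototype; others are untouched."""
    prototypes.update(build_prototypes({label: vectors}))


# Toy 2-D features standing in for encoder outputs.
prototypes = build_prototypes({
    "alice": [[1.0, 0.0], [0.9, 0.1]],
    "bob": [[0.0, 1.0], [0.1, 0.9]],
})
known_label, known_sim = classify([1.0, 0.05], prototypes)
unknown_label, unknown_sim = classify([1.0, 1.0], prototypes)
```

In this toy setup, adding a class only computes one new mean vector. That is one plausible reason a prototype-based system could be cheap to update, though the abstract does not describe PUDD's actual retraining procedure.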