Deep generative models can create remarkably photorealistic fake images, raising concerns about misinformation and copyright infringement, collectively known as deepfake threats. Deepfake detection techniques are developed to distinguish real images from fake ones, and existing methods typically train classifiers in the image domain or in various feature domains. However, generalizing deepfake detection to emerging and more advanced generative models remains challenging. In this paper, inspired by the zero-shot advantages of Vision-Language Models (VLMs), we propose a novel approach that uses VLMs (e.g., InstructBLIP) and prompt tuning to improve deepfake detection accuracy on unseen data. We formulate deepfake detection as a visual question answering problem and tune soft prompts for InstructBLIP to determine whether a query image is real or fake. We conduct full-spectrum experiments on datasets from 3 held-in and 13 held-out generative models, covering modern text-to-image generation, image editing, and image attacks. Results demonstrate that (1) deepfake detection accuracy can be significantly and consistently improved (from 54.6% to 91.31% average accuracy on unseen data) using pretrained vision-language models with prompt tuning; and (2) this superior performance comes at a lower cost in trainable parameters, resulting in an effective and efficient solution for deepfake detection. Code and models can be found at https://github.com/nctu-eva-lab/AntifakePrompt.
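To make the visual question answering framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a question template with a placeholder token standing in for the learned soft prompt, and a yes/no answer that is mapped to a real/fake label. The template wording, the `[S*]` token, and all function names are hypothetical, and `stub_vqa_model` is a stand-in for a tuned VLM such as InstructBLIP.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical template: "[S*]" marks where a learned soft-prompt token
# would be inserted. The wording is an assumption, not taken from the paper.
QUESTION_TEMPLATE = "Is this photo real {soft_token}?"
SOFT_TOKEN = "[S*]"


def build_question() -> str:
    """Return the fixed question asked about every query image."""
    return QUESTION_TEMPLATE.format(soft_token=SOFT_TOKEN)


def answer_to_label(answer: str) -> str:
    """Map a yes/no answer to 'real' or 'fake'."""
    normalized = answer.strip().lower()
    if normalized.startswith("yes"):
        return "real"
    if normalized.startswith("no"):
        return "fake"
    raise ValueError(f"unexpected answer: {answer!r}")


def detection_accuracy(
    vqa_model: Callable[[str, str], str],
    samples: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of (image, label) pairs whose predicted label is correct."""
    samples = list(samples)
    if not samples:
        raise ValueError("samples must not be empty")
    question = build_question()
    correct = sum(
        answer_to_label(vqa_model(image, question)) == label
        for image, label in samples
    )
    return correct / len(samples)


def stub_vqa_model(image: str, question: str) -> str:
    """Stand-in for a tuned VLM: answers from the file name only."""
    return "Yes" if image.startswith("camera") else "No"
```

In this framing only the embedding behind the placeholder token would be trained, while the VLM itself stays frozen; that is what keeps the number of trainable parameters small. Calling `detection_accuracy(stub_vqa_model, samples)` with a list of `(image, label)` pairs evaluates the stub in the same way a real model would be evaluated on held-out data.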