Deep generative models can create remarkably photorealistic fake images, raising concerns about misinformation and copyright infringement that are collectively known as deepfake threats. Deepfake detection techniques have been developed to distinguish real images from fake ones, and existing methods typically learn classifiers in the image domain or in various feature domains. However, generalizing deepfake detection to emerging and more advanced generative models remains challenging. In this paper, inspired by the zero-shot advantages of Vision-Language Models (VLMs), we propose a novel approach that uses VLMs (e.g., InstructBLIP) and prompt tuning to improve deepfake detection accuracy on unseen data. We formulate deepfake detection as a visual question answering problem and tune soft prompts for InstructBLIP so that it answers whether a query image is real or fake. We conduct full-spectrum experiments on datasets from 3 held-in and 13 held-out generative models, covering modern text-to-image generation, image editing, and image attacks. Results demonstrate that (1) deepfake detection accuracy can be significantly and consistently improved (from 58.8% to 91.31% in average accuracy over unseen data) using pretrained vision-language models with prompt tuning; and (2) this superior performance is achieved with fewer trainable parameters, resulting in an effective and efficient solution for deepfake detection. Code and models can be found at https://github.com/nctu-eva-lab/AntifakePrompt.
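The visual question answering formulation and the average-accuracy metric mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the question wording, the `stub_vqa_model` stand-in, the `synthetic_score` field, and the choice to average accuracy across datasets are all assumptions made for illustration. A real system would replace the stub with a prompt-tuned VLM such as InstructBLIP.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical question text; the abstract does not give the exact wording.
QUESTION = "Is this photo real?"

Sample = Tuple[dict, str]  # (image placeholder, ground-truth label "real"/"fake")


def answer_to_label(answer: str) -> str:
    """Map a free-form yes/no answer from the VQA model to 'real' or 'fake'."""
    text = answer.strip().lower()
    if text.startswith("yes"):
        return "real"
    if text.startswith("no"):
        return "fake"
    raise ValueError(f"unexpected answer: {answer!r}")


def detect(image: dict, vqa_model: Callable[[dict, str], str]) -> str:
    """Ask the fixed question about one image and convert the answer to a label."""
    return answer_to_label(vqa_model(image, QUESTION))


def average_accuracy(
    datasets: Dict[str, List[Sample]],
    vqa_model: Callable[[dict, str], str],
) -> float:
    """Mean of per-dataset accuracies (assumed definition of 'average accuracy')."""
    per_dataset = []
    for samples in datasets.values():
        correct = sum(detect(img, vqa_model) == label for img, label in samples)
        per_dataset.append(correct / len(samples))
    return sum(per_dataset) / len(per_dataset)


def stub_vqa_model(image: dict, question: str) -> str:
    """Stand-in for a prompt-tuned VLM: answers from a made-up 'synthetic_score'."""
    return "Yes" if image["synthetic_score"] < 0.5 else "No"
```

Treating the detector as a question-answering call keeps the decision logic independent of the underlying model, so the same evaluation loop can be reused for held-in and held-out generators.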