Classifying forged videos has been a challenge for the past few years. Deepfake classifiers can now reliably predict whether or not video frames have been tampered with. However, their performance depends on both the dataset used for training and the computational resources available to the analyst. We propose a deepfake detection method that operates in the latent space of a state-of-the-art generative adversarial network (GAN) trained on high-quality face images. The proposed method leverages the structure of the latent space of StyleGAN to learn a lightweight binary classification model. Experimental results on standard datasets show that the proposed approach outperforms other state-of-the-art deepfake classification methods, especially when the data available to train the models is scarce, such as when a new manipulation method is introduced. To the best of our knowledge, this is the first study to demonstrate the value of the latent space of StyleGAN for deepfake classification. Combined with other recent studies on the interpretation and manipulation of this latent space, we believe that the proposed approach can further help in developing frugal deepfake classification methods based on interpretable high-level properties of face images.
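To make the idea of a lightweight binary classifier over latent codes concrete, the following is a minimal sketch only, not the method evaluated in this work. It assumes that each face image has already been mapped to a single 512-dimensional latent vector by some GAN inversion step, which is not shown; here that step is replaced by synthetic Gaussian vectors. The choice of logistic regression, the names `synthetic_latents`, `fit_logistic`, `predict`, and `run_demo`, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

LATENT_DIM = 512  # assumption: size of one StyleGAN latent vector


def synthetic_latents(n, shift, rng):
    """Hypothetical stand-in for latent codes produced by a GAN inversion step."""
    return rng.normal(loc=shift, scale=1.0, size=(n, LATENT_DIM))


def fit_logistic(X, y, lr=0.1, epochs=200, l2=1e-3):
    """Fit L2-regularized logistic regression by full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        logits = np.clip(X @ w + b, -30.0, 30.0)  # keep exp() well-behaved
        p = 1.0 / (1.0 + np.exp(-logits))         # predicted P(forged)
        err = p - y
        w -= lr * (X.T @ err / len(y) + l2 * w)
        b -= lr * err.mean()
    return w, b


def predict(X, w, b):
    """Return 1 (forged) where the linear score is positive, else 0 (real)."""
    return (X @ w + b > 0).astype(int)


def run_demo(seed=0, n_train=100, n_test=500):
    """Train on a small labeled set and report accuracy on held-out samples."""
    rng = np.random.default_rng(seed)

    def split(n):
        # Label 0: "real" latents; label 1: "forged" latents with a small mean shift.
        X = np.vstack([synthetic_latents(n, 0.0, rng),
                       synthetic_latents(n, 0.3, rng)])
        y = np.concatenate([np.zeros(n), np.ones(n)])
        return X, y

    X_tr, y_tr = split(n_train)
    X_te, y_te = split(n_test)
    w, b = fit_logistic(X_tr, y_tr)
    acc = float((predict(X_te, w, b) == y_te).mean())
    return acc, w


if __name__ == "__main__":
    accuracy, weights = run_demo()
    print(f"held-out accuracy on synthetic latents: {accuracy:.3f}")
```

The model has only `LATENT_DIM + 1` parameters, which is the sense in which a classifier of this kind can be called lightweight and trained from few labeled examples. The accuracy printed here reflects only the synthetic separation built into the demo and says nothing about performance on real deepfake data.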