Classifying forged videos has been a challenge for the past few years. Deepfake classifiers can now reliably predict whether video frames have been tampered with. However, their performance depends on both the dataset used for training and the computational power available to the analyst. We propose a deepfake detection method that operates in the latent space of a state-of-the-art generative adversarial network (GAN) trained on high-quality face images. The proposed method leverages the structure of the StyleGAN latent space to learn a lightweight binary classification model. Experimental results on standard datasets show that the proposed approach outperforms other state-of-the-art deepfake classification methods, especially when little training data is available, such as when a new manipulation method is introduced. To the best of our knowledge, this is the first study to demonstrate the value of the StyleGAN latent space for deepfake classification. We believe that, combined with other recent studies on the interpretation and manipulation of this latent space, the proposed approach can further help in developing frugal deepfake classification methods based on interpretable high-level properties of face images.
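To make the idea of a lightweight binary classifier over latent codes concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how frames are mapped into the latent space or which classifier is used, so everything here is an assumption: the latent codes are replaced by synthetic Gaussian vectors, the 512-dimensional code size and the class shift are illustrative choices, the classifier is a plain logistic regression, and all function names are hypothetical.

```python
import numpy as np

LATENT_DIM = 512  # assumption: size of one latent code; not stated in the abstract


def make_synthetic_latents(n_per_class, rng):
    """Return (codes, labels) with label 0 = real, 1 = forged.

    Hypothetical stand-in for latent codes obtained by projecting face
    frames into a GAN latent space; here they are just Gaussian vectors.
    """
    shift = np.zeros(LATENT_DIM)
    shift[:16] = 1.5  # assumption: forged codes drift along a few directions
    real = rng.normal(size=(n_per_class, LATENT_DIM))
    forged = rng.normal(size=(n_per_class, LATENT_DIM)) + shift
    codes = np.vstack([real, forged])
    labels = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return codes, labels


def train_logistic_regression(x, y, lr=0.1, epochs=200, l2=1e-2):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        z = np.clip(x @ w + b, -30.0, 30.0)  # clip to keep exp() well-behaved
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * (x.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def predict(x, w, b):
    """Label a code as forged (1) when its linear score is positive."""
    return (x @ w + b > 0).astype(int)


def run_demo(seed=0):
    """Train on a small synthetic set and return held-out accuracy."""
    rng = np.random.default_rng(seed)
    x_train, y_train = make_synthetic_latents(500, rng)
    x_test, y_test = make_synthetic_latents(500, rng)
    mean = x_train.mean(axis=0)  # center with training statistics only
    w, b = train_logistic_regression(x_train - mean, y_train)
    return float(np.mean(predict(x_test - mean, w, b) == y_test))
```

Calling `run_demo()` trains on 1,000 synthetic codes and returns the fraction of held-out codes labeled correctly. The model has only `LATENT_DIM + 1` parameters, which is the sense in which such a classifier is lightweight. With real data, the synthetic generator would be replaced by latent codes computed from video frames.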