Facial forgery detection is a crucial but extremely challenging topic, as the rapid development of forgery techniques makes synthetic artefacts nearly indistinguishable from genuine content. Prior work shows that mining both spatial and frequency information can substantially improve the forgery detection performance of deep learning models. However, leveraging multiple types of information usually requires more than one branch in the neural network, which makes the model heavy and cumbersome. Knowledge distillation, an important technique for efficient modelling, is a possible remedy. We find, however, that existing knowledge distillation methods have difficulty distilling a dual-branch model into a single-branch model. More specifically, distilling from both the spatial and frequency branches performs worse than distilling from the spatial branch alone. To address this problem, we propose a novel two-in-one knowledge distillation framework that smoothly merges the information from a large dual-branch network into a small single-branch network, with the help of separate, dedicated feature projectors and a gradient homogenization technique. Experiments on two datasets, FaceForensics++ and Celeb-DF, show that the proposed framework achieves superior performance for facial forgery detection with far fewer parameters.
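As a rough illustration of the two ingredients named above, the sketch below distils one student feature towards two teacher features through two separate projectors, then balances the two resulting gradients before summing them. Everything in it is an assumption made for illustration: the abstract does not specify the projector architecture, the distillation loss, or the homogenization rule. Here the projectors are plain linear maps, the loss is a squared error, and "homogenization" is taken to mean rescaling both gradients to their mean norm. All function names are hypothetical.

```python
import math


def matvec(matrix, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def matvec_transposed(matrix, vec):
    """Product of the transposed matrix with vec."""
    n_cols = len(matrix[0])
    return [sum(matrix[r][c] * vec[r] for r in range(len(matrix)))
            for c in range(n_cols)]


def norm(vec):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in vec))


def distill_grad(projector, student_feat, teacher_feat):
    """Gradient of 0.5 * ||P s - t||^2 with respect to the student feature s.

    Assumption: a linear projector P and a squared-error distillation loss.
    """
    projected = matvec(projector, student_feat)
    residual = [p - t for p, t in zip(projected, teacher_feat)]
    return matvec_transposed(projector, residual)


def homogenize(grads, eps=1e-12):
    """Rescale every gradient to the mean of their norms.

    Assumption: this is one plausible reading of "gradient homogenization",
    not necessarily the rule used in the paper.
    """
    norms = [norm(g) for g in grads]
    target = sum(norms) / len(norms)
    return [[x * target / (n + eps) for x in g] for g, n in zip(grads, norms)]


def two_in_one_grad(student_feat, proj_spatial, teacher_spatial,
                    proj_freq, teacher_freq):
    """Combined gradient on the single-branch student feature."""
    g_spatial = distill_grad(proj_spatial, student_feat, teacher_spatial)
    g_freq = distill_grad(proj_freq, student_feat, teacher_freq)
    h_spatial, h_freq = homogenize([g_spatial, g_freq])
    return [a + b for a, b in zip(h_spatial, h_freq)]


# Toy example: the frequency term would dominate the update without rescaling.
student = [1.0, 2.0]
proj_spatial = [[1.0, 0.0], [0.0, 1.0]]
proj_freq = [[10.0, 0.0], [0.0, 10.0]]
combined = two_in_one_grad(student, proj_spatial, [0.0, 0.0],
                           proj_freq, [10.0, 0.0])
print(combined)
```

In this toy setting the raw frequency gradient is far larger than the spatial one, so a plain sum would be driven almost entirely by the frequency teacher. Rescaling keeps each gradient's direction while giving both teachers equal weight in the update, which is the kind of imbalance such a technique would plausibly target.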