This paper addresses the generalization problem in deepfake detection by exploiting the forgery quality of training data. The forgery quality of deepfakes varies: some contain easily recognizable forgery clues, while others are highly realistic. Existing works often train detectors on a mix of deepfakes of varying forgery quality, which can lead detectors to take shortcuts by relying on the easy-to-spot artifacts in low-quality forgery samples, thereby hurting generalization. To tackle this issue, we propose a novel quality-centric framework for generic deepfake detection, composed of a Quality Evaluator, a low-quality data enhancement module, and a learning pacing strategy that explicitly incorporates forgery quality into the training process. The framework is inspired by curriculum learning: it lets the detector learn progressively more challenging deepfake samples, starting with easier samples and progressing to more realistic ones. We use both static and dynamic assessments of forgery quality and combine their scores into a final rating for each training sample. This rating guides the selection of deepfake samples for training, with higher-rated samples having a higher probability of being chosen. Furthermore, we propose a novel frequency data augmentation method designed specifically for low-quality forgery samples, which reduces obvious forgery traces and improves their overall realism. Extensive experiments show that our method can be applied in a plug-and-play manner and significantly enhances generalization performance.
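As a rough illustration of the rating-guided sample selection described above, the following is a minimal sketch and not the paper's implementation. It assumes the static and dynamic scores are already normalized to [0, 1], that they are combined by a simple weighted average, and that selection probability is proportional to the final rating. The function names and the weight `alpha` are hypothetical.

```python
import numpy as np


def combine_scores(static_scores, dynamic_scores, alpha=0.5):
    """Blend two per-sample scores into one rating.

    alpha is an assumed mixing weight; the actual combination rule
    is not specified in the text above.
    """
    s = np.asarray(static_scores, dtype=float)
    d = np.asarray(dynamic_scores, dtype=float)
    return alpha * s + (1.0 - alpha) * d


def sampling_probs(ratings, eps=1e-8):
    """Map ratings to selection probabilities (higher rating, higher probability)."""
    r = np.asarray(ratings, dtype=float) + eps  # eps keeps every probability non-zero
    return r / r.sum()


def sample_batch(ratings, batch_size, rng):
    """Draw distinct sample indices, weighted by their ratings."""
    p = sampling_probs(ratings)
    return rng.choice(len(p), size=batch_size, replace=False, p=p)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ratings = combine_scores([0.2, 0.8, 0.5, 0.9], [0.4, 1.0, 0.5, 0.7])
    print(sample_batch(ratings, batch_size=2, rng=rng))
```

Proportional sampling is only one possible reading of "higher probability of being chosen". A pacing strategy would additionally change which samples are eligible, or how sharply the probabilities are peaked, as training progresses.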