GAN-generated image detection has become the first line of defense against malicious uses of machine-synthesized image manipulations such as deepfakes. Although some existing detectors perform well on clean samples from known GANs, their success is largely attributable to overfitting to unstable features such as frequency artifacts, which causes failures on unknown GANs or under perturbation attacks. To overcome this issue, we propose a robust detection framework based on a novel multi-view image completion representation. The framework first learns various view-to-image tasks to model the diverse distributions of genuine images. From the distributional discrepancies characterized by the completion models, it derives frequency-irrelevant features that are stable, generalizable, and robust for detecting unknown fake patterns. A multi-view classification scheme is then devised with elaborate intra- and inter-view learning strategies, which enhance view-specific feature representation and cross-view feature aggregation, respectively. We evaluate the generalization ability of our framework across six popular GANs at different resolutions, as well as its robustness against a broad range of perturbation attacks. The results confirm our method's improved effectiveness, generalization, and robustness over various baselines.