Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications across many fields. However, their ability to create hyper-realistic images makes it difficult to distinguish real from synthetic content, raising concerns about digital authenticity and potential misuse in creating deepfakes. This work introduces a robust detection framework that integrates image and text features extracted by the CLIP model with a Multilayer Perceptron (MLP) classifier. We propose a novel loss that improves the detector's robustness and handles imbalanced datasets. Additionally, we flatten the loss landscape during model training to improve the detector's generalization. Extensive experiments demonstrate that our method outperforms traditional detection techniques, underscoring its potential to set a new state of the art in DM-generated image detection. The code is available at https://github.com/Purdue-M2/Robust_DM_Generated_Image_Detection.
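To make the described pipeline concrete, the following is a minimal sketch of an MLP classifier scoring a pair of image and text feature vectors. It is not the authors' implementation: the abstract does not state how the two feature sets are combined, so concatenation is an assumption, as are the function names (`mlp_score`, `init_layer`), the layer sizes, and the random stand-in features used here in place of real CLIP embeddings. The proposed loss and the loss-landscape flattening step are not shown.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Small random weights; a trained detector would learn these."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def mlp_score(image_feat, text_feat, params):
    """Concatenate both feature vectors (an assumption) and return a score in (0, 1)."""
    x = list(image_feat) + list(text_feat)
    (w1, b1), (w2, b2) = params
    hidden = relu(linear(x, w1, b1))
    logit = linear(hidden, w2, b2)[0]
    return sigmoid(logit)

rng = random.Random(0)
FEAT_DIM = 8   # hypothetical size; real CLIP embeddings have hundreds of dimensions
HIDDEN = 16    # hypothetical hidden width
params = [init_layer(2 * FEAT_DIM, HIDDEN, rng), init_layer(HIDDEN, 1, rng)]

# Random stand-ins for the features a CLIP encoder would produce.
image_feat = [rng.gauss(0, 1) for _ in range(FEAT_DIM)]
text_feat = [rng.gauss(0, 1) for _ in range(FEAT_DIM)]
score = mlp_score(image_feat, text_feat, params)
```

With untrained weights the score carries no meaning; in a trained detector it would be thresholded to label an image as real or DM-generated.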