Generative models can now produce remarkably realistic images. This capability, however, also carries the risk of spreading false or misleading information. Existing detectors for generated images suffer from low accuracy and limited generalization. This paper addresses the problem by seeking a representation that generalizes well enough to improve the detection of generated images. We find that real and generated images exhibit distinct latent Gaussian representations when subjected to the inverse diffusion process of a pre-trained diffusion model. Exploiting this disparity lets us amplify subtle artifacts in generated images. Building on this insight, we introduce a novel image representation, the Diffusion Noise Feature (DNF), which is extracted from the noise estimated during the inverse diffusion process. A simple classifier, e.g., ResNet50, trained on DNF achieves high accuracy, robustness, and generalization in detecting generated images, even when the corresponding generator was built with datasets or structures not seen during the classifier's training. We conducted experiments using four training datasets and five test sets, achieving state-of-the-art detection performance.
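The extraction step described in the abstract, collecting the noise estimated while an image is pushed through an inverse diffusion process, can be illustrated with a minimal sketch. It assumes a deterministic DDIM-style inversion and a linear beta schedule, neither of which the abstract specifies. The function `toy_eps_model` is a hypothetical stand-in for a pre-trained noise predictor, and all names and parameters here are illustrative rather than taken from the authors' implementation.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def toy_eps_model(x, t):
    """Hypothetical stand-in for a pre-trained noise predictor eps_theta(x, t)."""
    return 0.1 * np.tanh(x) * (1.0 + t / 1000.0)


def ddim_inversion_noise(x0, eps_model, alpha_bar, timesteps):
    """Deterministic DDIM-style inversion starting from an image x0.

    Returns the noise estimated at every step (candidate noise features)
    and the final latent reached at the last timestep.
    """
    x = np.asarray(x0, dtype=np.float64)
    noises = []
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        eps = eps_model(x, t_cur)  # noise estimated at the current step
        noises.append(eps)
        a_cur, a_next = alpha_bar[t_cur], alpha_bar[t_next]
        # Predict the clean image, then move to the next (noisier) timestep.
        x0_pred = (x - np.sqrt(1.0 - a_cur) * eps) / np.sqrt(a_cur)
        x = np.sqrt(a_next) * x0_pred + np.sqrt(1.0 - a_next) * eps
    return np.stack(noises), x


if __name__ == "__main__":
    alpha_bar = make_alpha_bar()
    timesteps = np.linspace(0, 999, 21).astype(int)  # 20 inversion steps
    image = np.random.default_rng(0).uniform(-1.0, 1.0, size=(3, 8, 8))
    noises, latent = ddim_inversion_noise(image, toy_eps_model, alpha_bar, timesteps)
    print(noises.shape, latent.shape)
```

In a real pipeline the stand-in predictor would be replaced by the pre-trained diffusion model, and one or more of the collected noise maps would be passed to the classifier as its input. Which step or combination of steps to use is left open here, since the abstract does not say.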