The rapid advancement of generative AI has raised concerns about the authenticity of digital images: highly realistic fake images can now be generated at low cost, which may increase societal risks. In response, several datasets have been built to train detection models that distinguish AI-generated images from real ones. However, models trained on existing datasets generalize poorly, and the datasets themselves suffer from low image quality, overly simple prompts, and insufficient image diversity. To address these limitations, we propose a high-quality, large-scale dataset of over 730,000 images across multiple categories, comprising both real and AI-generated images. The generated images are synthesized with state-of-the-art methods, including text-to-image generation (guided by over 10,000 carefully designed prompts), image inpainting, image refinement, and face swapping. Each generated image is annotated with its generation method and category, and inpainted images additionally include binary masks that mark the inpainted regions, providing rich metadata for analysis. Detection models trained on our dataset generalize better than those trained on existing datasets. The dataset therefore serves as a strong benchmark for evaluating detection methods and contributes to more robust AI-generated image detection. Building on this dataset, we propose a lightweight detection method based on image noise entropy, which transforms the original image into an entropy tensor of Non-Local Means (NLM) noise before classification. Extensive experiments confirm the generalization of models trained on our dataset and show that our method delivers competitive performance, establishing a solid baseline for future research. The dataset and source code are publicly available at https://real-hd.github.io.
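The noise-entropy idea described in the abstract can be illustrated with a small, dependency-free sketch: denoise a grayscale image with a plain Non-Local Means filter, take the residual (image minus denoised image) as the noise, and compute the Shannon entropy of that residual over tiles. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the tile size, histogram binning, color handling, or NLM parameters, so the function names (`nlm_denoise`, `shannon_entropy`, `noise_entropy_map`) and all default values below are hypothetical.

```python
import math
import random
from collections import Counter


def nlm_denoise(img, patch=1, search=2, h=10.0):
    """Plain Non-Local Means on a 2D grayscale image (list of rows of floats).

    patch and search are radii; h controls how quickly weights decay.
    All defaults are illustrative assumptions.
    """
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Clamp coordinates so patches near the border stay defined.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    def patch_dist(r1, c1, r2, c2):
        # Mean squared difference between the patches centred on two pixels.
        total = 0.0
        for dr in range(-patch, patch + 1):
            for dc in range(-patch, patch + 1):
                d = px(r1 + dr, c1 + dc) - px(r2 + dr, c2 + dc)
                total += d * d
        return total / (2 * patch + 1) ** 2

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = den = 0.0
            for sr in range(max(r - search, 0), min(r + search, rows - 1) + 1):
                for sc in range(max(c - search, 0), min(c + search, cols - 1) + 1):
                    # Similar patches get weights near 1, dissimilar ones near 0.
                    w = math.exp(-patch_dist(r, c, sr, sc) / (h * h))
                    num += w * img[sr][sc]
                    den += w
            out[r][c] = num / den  # den >= 1: the pixel itself has weight 1
    return out


def shannon_entropy(values, bin_width=1.0):
    """Shannon entropy in bits after quantising values into fixed-width bins."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return sum((k / n) * math.log2(n / k) for k in counts.values())


def noise_entropy_map(img, block=4, **nlm_kwargs):
    """Entropy of the NLM residual over non-overlapping block x block tiles."""
    denoised = nlm_denoise(img, **nlm_kwargs)
    rows, cols = len(img), len(img[0])
    noise = [[img[r][c] - denoised[r][c] for c in range(cols)] for r in range(rows)]
    emap = []
    for r0 in range(0, rows - block + 1, block):
        row = []
        for c0 in range(0, cols - block + 1, block):
            tile = [noise[r][c]
                    for r in range(r0, r0 + block)
                    for c in range(c0, c0 + block)]
            row.append(shannon_entropy(tile))
        emap.append(row)
    return emap


# Toy inputs: a perfectly flat image and the same image with Gaussian noise.
rng = random.Random(0)
flat = [[128.0] * 8 for _ in range(8)]
noisy = [[128.0 + rng.gauss(0, 8) for _ in range(8)] for _ in range(8)]

flat_map = noise_entropy_map(flat)
noisy_map = noise_entropy_map(noisy)
```

On the flat image the residual is zero everywhere, so every tile has zero entropy, while the noisy image yields positive entropies bounded by log2(16) = 4 bits per 4x4 tile. The resulting 2D map is the kind of compact input a lightweight classifier could consume. In practice the pure-Python filter would be replaced by an optimized NLM routine such as OpenCV's `fastNlMeansDenoising` or scikit-image's `denoise_nl_means`.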