The extraordinary ability of generative models has enabled the generation of images of such high quality that humans cannot distinguish Artificial Intelligence (AI) generated images from real-life photographs. The development of generation techniques has opened up new opportunities but has also introduced potential risks to privacy, authenticity, and security. The task of detecting AI-generated imagery is therefore of paramount importance in preventing illegal activities. To assess the generalizability and robustness of AI-generated image detection, we present a large-scale dataset, referred to as WildFake, comprising state-of-the-art generators, diverse object categories, and real-world applications. The WildFake dataset has the following advantages: 1) Rich content with wild collection: WildFake collects fake images from the open-source community, enriching its diversity with a broad range of image classes and image styles. 2) Hierarchical structure: WildFake contains fake images synthesized by different types of generators, from GANs and diffusion models to other generative models. These key strengths enhance the generalization and robustness of detectors trained on WildFake, demonstrating WildFake's considerable relevance and effectiveness for AI-generated image detectors in real-world scenarios. Moreover, our extensive evaluation experiments are designed to yield insights into the capabilities of generative models at different levels of the hierarchy, a distinctive advantage afforded by WildFake's hierarchical structure.