One of the key challenges in detecting AI-generated images is identifying images created by previously unseen generative models. We argue that the limited diversity of the training data is a major obstacle to addressing this problem, and we propose a new dataset that is significantly larger and more diverse than prior work. As part of creating this dataset, we systematically download thousands of text-to-image latent diffusion models and sample images from them. We also collect images from dozens of popular open-source and commercial models. The resulting dataset contains 2.7M images sampled from 4803 different models. These images collectively capture a wide range of scene content, generator architectures, and image processing settings. Using this dataset, we study the generalization abilities of fake image detectors. Our experiments suggest that detection performance improves as the number of models in the training set increases, even when these models have similar architectures. We also find that detection performance improves as the diversity of the models increases, and that our trained detectors generalize better than those trained on other datasets.
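To make the sampling pipeline concrete, the sketch below shows one way to enumerate public text-to-image checkpoints on the Hugging Face Hub and draw samples from each with the diffusers library. This is a minimal illustration under assumed choices, not the paper's actual harness: the model filter, prompt set, sample counts, and output paths are all placeholders.

```python
# Minimal sketch: enumerate public text-to-image checkpoints and sample
# one image per prompt from each. Filter, prompts, and paths are
# illustrative assumptions, not the dataset's real collection pipeline.
import torch
from huggingface_hub import HfApi
from diffusers import DiffusionPipeline

api = HfApi()
# List public checkpoints tagged for text-to-image generation,
# most-downloaded first (limit kept small for the sketch).
models = api.list_models(filter="text-to-image", sort="downloads", limit=5)

prompts = ["a photo of a city street at dusk"]  # placeholder prompt set

for info in models:
    try:
        pipe = DiffusionPipeline.from_pretrained(
            info.id, torch_dtype=torch.float16
        ).to("cuda")
        for i, prompt in enumerate(prompts):
            image = pipe(prompt).images[0]
            image.save(f"{info.id.replace('/', '_')}_{i}.png")
    except Exception:
        # Many community checkpoints are broken, gated, or incompatible
        # with a generic pipeline loader; skip them and move on.
        continue
```

In practice a large-scale crawl of thousands of models would also need per-model metadata logging, deduplication, and more robust error handling than this loop shows.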