Photos let people record what they experience in their daily lives, and they are often regarded as trustworthy sources of information. However, there is growing concern that advances in artificial intelligence (AI) make it possible to produce fake photos, which can create confusion and diminish trust in photographs. This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content. We benchmark both human capability and cutting-edge AI algorithms for fake image detection, using Fake2M, a newly collected large-scale fake image dataset. In our human perception evaluation, titled HPBench, we find that humans struggle significantly to distinguish real photos from AI-generated ones, with a misclassification rate of 38.7%. Alongside this, we conduct MPBench, an evaluation of model capability for AI-generated image detection; the top-performing model in MPBench achieves a 13% failure rate under the same setting used in the human evaluation. We hope that our study can raise awareness of the potential risks of AI-generated images and facilitate further research to prevent the spread of false information. More information is available at https://github.com/Inf-imagine/Sentry.