Recent progress in generative artificial intelligence (gen-AI) has enabled the generation of photorealistic and artistically inspiring images with a single click, catering to millions of users online. To explore how people use gen-AI models such as DALL-E and Stable Diffusion, it is critical to understand the themes, contents, and variations present in AI-generated images. In this work, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), a comprehensive dataset encompassing over 800,000 gen-AI images collected from January 2021 to March 2023 on Twitter, with associated metadata (e.g., tweet text, creation date, number of likes), available at https://zenodo.org/records/8031785. Through a comparative analysis of TWIGMA with natural images and human artwork, we find that gen-AI images possess distinctive characteristics and exhibit, on average, lower variability than their non-gen-AI counterparts. Additionally, we find that the similarity between a gen-AI image and natural images is inversely correlated with the number of likes it receives. Finally, we observe a longitudinal shift in the themes of AI-generated images on Twitter: users increasingly share artistically sophisticated content such as intricate human portraits, while their interest in simple subjects such as natural scenes and animals has decreased. Our findings underscore the significance of TWIGMA as a unique data resource for studying AI-generated images.