Generative image models have emerged as a promising technology for producing realistic images. Despite their potential benefits, concerns are growing about their misuse, particularly in generating deceptive images that could raise significant ethical, legal, and societal issues. Consequently, there is growing demand to empower users to effectively discern and comprehend patterns of AI-generated images. To this end, we developed ASAP, an interactive visualization system that automatically extracts distinct patterns of AI-generated images and allows users to explore them interactively through multiple views. To uncover fake patterns, ASAP introduces a novel image encoder, adapted from CLIP, which transforms images into compact "distilled" representations enriched with information for differentiating authentic from fake images. These representations generate gradients that propagate back to the attention maps of CLIP's transformer block. This process quantifies the relative importance of each pixel to image authenticity or fakeness, exposing key deceptive patterns. ASAP enables interactive analysis of these patterns at scale through multiple coordinated visualizations. These include a representation overview with innovative cell glyphs to aid in the exploration and qualitative evaluation of fake patterns across a vast array of images, as well as a pattern view that displays authenticity-indicating patterns in images and quantifies their impact. ASAP supports the analysis of cutting-edge generative models with the latest architectures, including GAN-based models like proGAN and diffusion models like the latent diffusion model. We demonstrate ASAP's usefulness through two usage scenarios using multiple fake image detection benchmark datasets, revealing its ability to identify and understand hidden patterns in AI-generated images, especially in detecting fake human faces produced by diffusion-based techniques.
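The attribution step described in the abstract (gradients propagated back to the attention maps to score each pixel) can be illustrated with a minimal sketch. The abstract does not give ASAP's exact formula, so the code below assumes a common gradient-weighted attention formulation: multiply each attention weight by its gradient, keep the positive part, average over heads, and reshape the class token's scores onto the patch grid. The function name `patch_relevance`, its parameters, and the toy numbers are hypothetical and are not taken from ASAP.

```python
from typing import List


def patch_relevance(
    cls_attn: List[List[float]],
    cls_grad: List[List[float]],
    grid: int,
) -> List[List[float]]:
    """Gradient-weighted attention heatmap over image patches (illustrative only).

    cls_attn[h][j]: attention from the class token to token j in head h.
    cls_grad[h][j]: gradient of the real/fake score w.r.t. that attention weight.
    Token 0 is the class token itself; tokens 1..grid*grid are image patches.
    """
    heads = len(cls_attn)
    tokens = len(cls_attn[0])
    if tokens != 1 + grid * grid:
        raise ValueError("expected one class token plus grid*grid patch tokens")

    # Assumption: relevance = mean over heads of max(gradient * attention, 0).
    scores = [
        sum(max(cls_grad[h][j] * cls_attn[h][j], 0.0) for h in range(heads)) / heads
        for j in range(1, tokens)  # skip the class token
    ]

    # Scale to [0, 1] so heatmaps are comparable across images.
    peak = max(scores)
    if peak > 0:
        scores = [s / peak for s in scores]

    # Lay the flat patch scores out as a grid x grid heatmap.
    return [scores[r * grid:(r + 1) * grid] for r in range(grid)]


# Toy input: one head, a 2x2 patch grid, made-up attention and gradient values.
heatmap = patch_relevance(
    cls_attn=[[0.2, 0.1, 0.4, 0.1, 0.2]],
    cls_grad=[[1.0, 2.0, 1.0, -3.0, 0.5]],
    grid=2,
)
for row in heatmap:
    print(row)
```

In a real pipeline the attention weights and gradients would come from the encoder's forward and backward passes, and the patch-level heatmap would be upsampled to pixel resolution; both steps are omitted here.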