Image and video generation models can now produce photorealistic images of unprecedented quality, and in many cases real and fake images are difficult to tell apart. Despite this progress, a gap remains between the quality of generated images and that of real-world images. To characterize this gap, we review a large body of literature from academic publications and social media, identify qualitative shortcomings of image generation models, and classify them into five categories. Understanding these failures helps pinpoint where the models need improvement and informs strategies for detecting deepfakes. The prevalence of deepfakes in today's society is a serious concern, and our findings can help mitigate their negative impact.