The rapid recent advances in generative artificial intelligence have brought significant successes and promise across a wide range of applications, from conversational agents and textual content generation to voice and visual synthesis. Amid the rise of generative AI and its increasingly widespread adoption, concern has grown over its use for malicious purposes. In visual content synthesis, two key areas of concern are image forgery (e.g., generation of images containing or derived from copyrighted content) and data poisoning (i.e., generation of adversarially contaminated images). To address these concerns and encourage responsible generative AI, we introduce the DeepfakeArt Challenge, a large-scale challenge benchmark dataset designed specifically to aid in building machine learning algorithms for generative AI art forgery and data poisoning detection. The dataset comprises over 32,000 records spanning a variety of generative forgery and data poisoning techniques; each record consists of a pair of images that are either forgeries / adversarially contaminated or not. Each generated image in the DeepfakeArt Challenge benchmark dataset has been comprehensively quality checked. The DeepfakeArt Challenge is a core part of GenAI4Good, a global open source initiative for accelerating machine learning to promote the responsible creation and deployment of generative AI for good.