In this paper, we present an empirical study introducing a nuanced evaluation framework for text-to-image (T2I) generative models, applied to human image synthesis. Our framework categorizes evaluations into two distinct groups: the first focuses on image qualities such as aesthetics and realism, while the second examines text conditions through concept coverage and fairness. We introduce a novel aesthetic score prediction model that assesses the visual appeal of generated images, and we release the first dataset annotated with low-quality regions in generated human images to facilitate automatic defect detection. Our exploration of concept coverage probes the model's effectiveness in accurately interpreting and rendering text-based concepts, while our analysis of fairness reveals biases in model outputs, with an emphasis on gender, race, and age. While our study is grounded in human imagery, this dual-faceted approach is designed to be flexible enough to apply to other forms of image generation, enhancing our understanding of generative models and paving the way toward the next generation of more sophisticated, contextually aware, and ethically attuned generative models. Code and data, including the dataset annotated with defective areas, are available at \href{https://github.com/cure-lab/EvaluateAIGC}{https://github.com/cure-lab/EvaluateAIGC}.