The Algorithmic Gaze of Image Quality Assessment: An Audit and Trace Ethnography of the LAION-Aesthetics Predictor

Visual generative AI models are trained using a one-size-fits-all measure of aesthetic appeal. However, what is deemed "aesthetic" is inextricably linked to personal taste and cultural values, raising the question of whose taste is represented in visual generative AI models. In this work, we study an aesthetic evaluation model, the LAION-Aesthetics Predictor (LAP), that is widely used both to curate datasets for training visual generative models such as Stable Diffusion and to evaluate the quality of AI-generated images. To understand what LAP measures, we audited the model across three datasets. First, we examined the impact of aesthetic filtering on the LAION-Aesthetics Dataset (approximately 1.2B images), which was curated from LAION-5B using LAP. We find that LAP disproportionately filters in images with captions mentioning women, while filtering out images with captions mentioning men or LGBTQ+ people. Then, we used LAP to score approximately 330k images across two art datasets, finding that the model rates realistic images of landscapes, cityscapes, and portraits by Western and Japanese artists most highly. In doing so, the algorithmic gaze of this aesthetic evaluation model reinforces the imperial and male gazes found within Western art history. To understand where these biases may have originated, we performed a digital ethnography of public materials related to the creation of LAP. We find that the development of LAP reflects the biases we found in our audits: for example, the aesthetic scores used to train LAP came primarily from English-speaking photographers and Western AI enthusiasts. In response, we discuss how aesthetic evaluation can perpetuate representational harms and call on AI developers to shift away from prescriptive measures of "aesthetics" toward more pluralistic evaluation.
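As a rough illustration of the first audit, the sketch below computes, for each group of caption terms, the share of matching images that would survive a score cutoff. It is not the authors' pipeline: the term lists, the cutoff value, the `retention_by_group` name, and the `(caption, score)` record format are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import defaultdict

# Hypothetical term lists; the abstract does not give the lexicon used in the paper.
GROUP_TERMS = {
    "women": ["woman", "women"],
    "men": ["man", "men"],
}


def retention_by_group(records, threshold, group_terms=GROUP_TERMS):
    """Return {group: share of matching captions whose score >= threshold}.

    records: iterable of (caption, aesthetic_score) pairs (assumed format).
    threshold: assumed score cutoff used to decide which images are kept.
    """
    # Whole-word, case-insensitive matching, so "woman" does not count as "man".
    patterns = {
        group: re.compile(
            r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
            re.IGNORECASE,
        )
        for group, terms in group_terms.items()
    }
    seen = defaultdict(int)  # captions mentioning the group
    kept = defaultdict(int)  # of those, captions whose image passes the cutoff
    for caption, score in records:
        for group, pattern in patterns.items():
            if pattern.search(caption):
                seen[group] += 1
                if score >= threshold:
                    kept[group] += 1
    # Groups with no matching captions are omitted rather than reported as 0.
    return {group: kept[group] / seen[group] for group in seen}


if __name__ == "__main__":
    # Toy records with made-up scores, for demonstration only.
    toy = [
        ("a woman in a field", 6.0),
        ("a woman reading", 5.2),
        ("women dancing", 3.0),
        ("two men talking", 4.0),
        ("a man at sunset", 5.5),
    ]
    print(retention_by_group(toy, threshold=5.0))
```

Comparing these per-group rates against the overall retention rate is one simple way to see whether a filter keeps some groups disproportionately often; keyword matching on captions is only a coarse proxy for what an image actually depicts.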

