The Algorithmic Gaze: An Audit and Ethnography of the LAION-Aesthetics Predictor Model

Visual generative AI models are trained using a one-size-fits-all measure of aesthetic appeal. However, what is deemed "aesthetic" is inextricably linked to personal taste and cultural values, raising the question of whose taste is represented in visual generative AI models. In this work, we study an aesthetic evaluation model, the LAION Aesthetic Predictor (LAP), that is widely used both to curate datasets for training visual generative models, like Stable Diffusion, and to evaluate the quality of AI-generated images. To understand what LAP measures, we audited the model across three datasets. First, we examined the impact of aesthetic filtering on the LAION-Aesthetics Dataset (approximately 1.2B images), which was curated from LAION-5B using LAP. We find that LAP disproportionately filters in images with captions mentioning women, while filtering out images with captions mentioning men or LGBTQ+ people. Then, we used LAP to score approximately 330k images across two art datasets, finding that the model rates realistic images of landscapes, cityscapes, and portraits from western and Japanese artists most highly. The algorithmic gaze of this aesthetic evaluation model thereby reinforces the imperial and male gazes found within western art history. To understand where these biases may have originated, we performed a digital ethnography of public materials related to the creation of LAP. We find that the development of LAP reflects the biases we found in our audits: for example, the aesthetic scores used to train LAP came primarily from English-speaking photographers and western AI enthusiasts. In response, we discuss how aesthetic evaluation can perpetuate representational harms and call on AI developers to shift away from prescriptive measures of "aesthetics" toward more pluralistic evaluation.
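The caption-level audit described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the term lists, the score threshold of 5.0, and the toy records are all assumptions made for illustration. The idea is simply to compare how often a group is mentioned in captions before and after a score-based filter; a ratio above 1 suggests the group is "filtered in", and a ratio below 1 suggests it is "filtered out".

```python
import re

def mention_rate(captions, terms):
    """Fraction of captions that contain any of the terms as a whole word."""
    if not captions:
        return 0.0
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return sum(1 for c in captions if pattern.search(c)) / len(captions)

def filtering_ratio(records, terms, threshold):
    """Mention rate after score filtering divided by the rate before it.

    `records` is a list of (caption, aesthetic_score) pairs. Returns None
    when the terms never appear in the unfiltered captions.
    """
    before = [caption for caption, _ in records]
    after = [caption for caption, score in records if score >= threshold]
    rate_before = mention_rate(before, terms)
    if rate_before == 0:
        return None
    return mention_rate(after, terms) / rate_before

# Toy data (hypothetical captions and scores, not drawn from LAION).
records = [
    ("a woman in a field", 6.1),
    ("a man fixing a car", 4.2),
    ("portrait of a woman", 5.8),
    ("a man and a dog", 5.5),
]

ratio_women = filtering_ratio(records, ["woman", "women"], threshold=5.0)
ratio_men = filtering_ratio(records, ["man", "men"], threshold=5.0)
print(ratio_women, ratio_men)
```

Whole-word matching matters here: without the `\b` boundaries, a search for "man" would also match "woman" and distort both rates. Keyword matching on captions is still only a rough proxy for who is depicted in an image.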

