Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You

Text-to-image (T2I) generation models have recently achieved astonishing results in image quality, flexibility, and text alignment, and are consequently employed in a fast-growing number of applications. Improvements in multilingual abilities have given a larger community access to this technology. Yet, as we will show, multilingual models suffer from (gender) biases much as monolingual models do. Furthermore, the natural expectation is that these models provide similar results across languages, but this is not the case: there are important differences between languages. We therefore propose MAGBIG, a novel benchmark intended to foster research on multilingual models without gender bias. With MAGBIG, we investigate whether multilingual T2I models magnify gender bias. To this end, we use multilingual prompts requesting portrait images of persons with a certain occupation or trait (expressed through adjectives). Our results show not only that models deviate from the normative assumption that each gender should be equally likely to be generated, but also that there are large differences across languages. We further investigate prompt engineering strategies, i.e., the use of indirect, neutral formulations, as a possible remedy for these biases. Unfortunately, they help only to a limited extent and result in worse text-to-image alignment. Consequently, this work calls for more research into diverse representations across languages in image generators.
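The normative assumption above (each gender equally likely to be generated) can be made concrete with a small sketch. The code below is an illustration only, not the metric used in MAGBIG: the function name `parity_gap`, the two-group label set, and the per-language sample data are all hypothetical, and the labels are assumed to come from some external classifier applied to the generated portraits.

```python
from collections import Counter


def parity_gap(labels, groups=("female", "male")):
    """Mean absolute deviation of observed group shares from a uniform split.

    Hypothetical helper for illustration; not the paper's metric.
    Labels outside `groups` count toward the total but not toward any share.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    target = 1.0 / len(groups)  # equal share expected for every group
    total = len(labels)
    return sum(abs(counts[g] / total - target) for g in groups) / len(groups)


# Made-up classifier outputs for one occupation prompt in two languages.
samples = {
    "en": ["male"] * 8 + ["female"] * 2,
    "de": ["male"] * 5 + ["female"] * 5,
}

# A gap of 0 means the generated images match the equal-likelihood assumption.
gaps = {lang: parity_gap(labels) for lang, labels in samples.items()}
for lang, gap in gaps.items():
    print(f"{lang}: parity gap = {gap:.2f}")
```

Comparing the gap for the same prompt across languages is one simple way to surface the cross-language differences the abstract describes; a real evaluation would need many prompts and many images per prompt.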
