Text-to-Image (T2I) generation models have been widely adopted across industries, yet they are criticized for frequently exhibiting societal stereotypes. While a growing body of research seeks to evaluate and mitigate these biases, the field still contends with conceptual ambiguity: terms such as "bias" and "fairness" are not always clearly distinguished and often lack operational definitions. This paper provides a comprehensive systematic review of the T2I fairness literature, organizing existing work into a taxonomy of bias types and fairness notions. We critically assess the gap between "target fairness" (normative ideals in T2I outputs) and "threshold fairness" (normative standards with actionable decision rules). Furthermore, we survey the landscape of mitigation strategies, ranging from prompt engineering to diffusion process manipulation. We conclude by proposing a new framework for operationalizing fairness that moves beyond descriptive metrics towards rigorous, target-based testing, offering an approach for more accountable generative AI development.