Image search and retrieval tasks can perpetuate harmful stereotypes, erase cultural identities, and amplify social disparities. Current approaches to mitigate these representational harms balance the number of retrieved items across population groups defined by a small number of (often binary) attributes. However, most existing methods overlook intersectional groups determined by combinations of group attributes, such as gender, race, and ethnicity. We introduce Multi-Group Proportional Representation (MPR), a novel metric that measures representation across intersectional groups. We develop practical methods for estimating MPR, provide theoretical guarantees, and propose optimization algorithms to ensure MPR in retrieval. We demonstrate that existing methods optimizing for equal and proportional representation metrics may fail to promote MPR. Crucially, our work shows that optimizing MPR yields more proportional representation across multiple intersectional groups specified by a rich function class, often with minimal compromise in retrieval accuracy.
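To make the contrast between single-attribute and intersectional representation concrete, the following is a minimal sketch and not the formal definition of MPR. It assumes groups are given by exact combinations of categorical attribute values (a much narrower class than the rich function class mentioned above) and reports the largest absolute gap between a group's share of the retrieved items and its share of a reference population. The function name `worst_representation_gap`, the dictionary-based item format, and the toy data are all hypothetical.

```python
from itertools import product


def worst_representation_gap(retrieved, reference, attributes):
    """Return (largest gap, group attaining it) over all attribute-value combinations.

    Assumption: each item is a dict mapping attribute name -> categorical value,
    and a "group" is one combination of values for the listed attributes.
    """
    # Collect the values each attribute takes in either set.
    values = [sorted({item[a] for item in retrieved + reference}) for a in attributes]
    worst_gap, worst_group = 0.0, None
    for combo in product(*values):
        def in_group(item):
            return all(item[a] == v for a, v in zip(attributes, combo))

        share_retrieved = sum(in_group(x) for x in retrieved) / len(retrieved)
        share_reference = sum(in_group(x) for x in reference) / len(reference)
        gap = abs(share_retrieved - share_reference)
        if gap > worst_gap:
            worst_gap, worst_group = gap, combo
    return worst_gap, worst_group


# Toy data (hypothetical): the reference population is uniform over four groups.
reference = [
    {"gender": "F", "race": "A"},
    {"gender": "F", "race": "B"},
    {"gender": "M", "race": "A"},
    {"gender": "M", "race": "B"},
]
# The retrieved set is balanced on each attribute separately but omits two groups.
retrieved = [
    {"gender": "F", "race": "A"},
    {"gender": "F", "race": "A"},
    {"gender": "M", "race": "B"},
    {"gender": "M", "race": "B"},
]

gender_gap, _ = worst_representation_gap(retrieved, reference, ["gender"])
race_gap, _ = worst_representation_gap(retrieved, reference, ["race"])
joint_gap, joint_group = worst_representation_gap(retrieved, reference, ["gender", "race"])
print(gender_gap, race_gap, joint_gap, joint_group)
```

In this toy case the retrieved set matches the reference on gender alone and on race alone, yet each gender-race combination is over- or under-represented by a quarter of the set. That is the kind of failure that balancing individual attributes can leave undetected.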