Fairness in Image Search: A Study of Occupational Stereotyping in Image Retrieval and its Debiasing

Multi-modal search engines have seen significant growth and widespread adoption in recent years, making search the second most common use of the internet. While search engine systems offer a range of services, image search has recently become a focal point in the information retrieval community; as the adage goes, "a picture is worth a thousand words". Although popular search engines like Google excel at image search accuracy and agility, there is an ongoing debate over whether their search results are biased in terms of gender, language, demographics, socio-cultural aspects, and stereotypes. Such bias can significantly shape individuals' perceptions and influence their perspectives. In this paper, we present our study on bias and fairness in web search, with a focus on keyword-based image search. We first discuss several kinds of bias that exist in search systems and why it is important to mitigate them. We then narrow our study to assessing and mitigating occupational stereotypes in image search, a prevalent fairness issue in image retrieval. To assess stereotypes, we take gender as an indicator. We explore various open-source and proprietary APIs for gender identification from images. With these, we examine the extent of gender bias in top-ranked image search results obtained for several occupational keywords. To mitigate the bias, we propose a fairness-aware re-ranking algorithm that optimizes (a) the relevance of each search result to the keyword and (b) fairness with respect to the identified genders. We experiment on 100 top-ranked images obtained for 10 occupational keywords and consider random re-ranking and relevance-based re-ranking as baselines. Our experimental results show that the fairness-aware re-ranking algorithm produces rankings with better fairness scores than the baselines while keeping relevance scores competitive.
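To make the re-ranking idea concrete, the following is a minimal sketch of one way to trade off relevance against gender balance. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function name `fairness_aware_rerank`, the weight `lam`, the equal-share target distribution, and the use of total variation distance as the imbalance measure are all assumptions made for illustration.

```python
from collections import Counter


def fairness_aware_rerank(items, lam=0.5, target=None):
    """Greedy re-ranking sketch (illustrative; not the paper's algorithm).

    items  : list of (item_id, relevance in [0, 1], gender_label) tuples
    lam    : hypothetical trade-off weight; 1.0 ranks by relevance only
    target : desired share per gender label; defaults to equal shares
    """
    labels = sorted({gender for _, _, gender in items})
    if target is None:
        target = {g: 1.0 / len(labels) for g in labels}

    remaining = list(items)
    ranked, counts = [], Counter()

    while remaining:
        k = len(ranked) + 1  # size of the prefix once a candidate is appended

        def score(item):
            _, relevance, gender = item
            # Total variation distance between the prefix's gender shares
            # (with this candidate included) and the target shares.
            imbalance = sum(
                abs((counts[g] + (g == gender)) / k - target[g]) for g in labels
            ) / 2
            return lam * relevance + (1 - lam) * (1 - imbalance)

        best = max(remaining, key=score)
        remaining.remove(best)
        ranked.append(best)
        counts[best[2]] += 1

    return ranked


# Hypothetical data: relevance scores and gender labels are made up.
results = [
    ("a", 0.95, "m"), ("b", 0.90, "m"), ("c", 0.85, "m"),
    ("d", 0.80, "m"), ("e", 0.75, "f"), ("f", 0.70, "f"),
]
print([item_id for item_id, _, _ in fairness_aware_rerank(results, lam=0.5)])
```

Setting `lam=1.0` reduces the sketch to the relevance-only baseline, while lower values push each prefix of the ranking toward the target gender shares at some cost in relevance.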
