When learning to rank from user interactions, search and recommendation systems must address biases in user behavior to provide high-quality rankings. One type of bias recently studied in the ranking literature arises when sensitive attributes, such as gender, affect a user's judgment of an item's utility. For example, in a search for experts in a given area, some users may be biased towards clicking on male candidates over female candidates. We call this type of bias group membership bias, or group bias for short. Increasingly, we seek rankings that not only have high utility but are also fair to individuals and sensitive groups. Merit-based fairness measures rely on the estimated merit or utility of the items. Under group bias, the utility of the affected sensitive groups is underestimated; hence, without correcting for this bias, a supposedly fair ranking is not truly fair. In this paper, we first analyze the impact of group bias on ranking quality and on two well-known merit-based fairness metrics, and show that group bias can hurt both ranking quality and fairness. We then propose a correction method for group bias based on the assumption that the utility scores of items in different groups come from the same distribution. This assumption raises two potential issues, sparsity and equality-instead-of-equity, which we address with an amortized approach. We show that our correction method can consistently compensate for the negative impact of group bias on ranking quality and fairness metrics.
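To make the equal-distribution assumption concrete, the following is a minimal sketch and not the paper's actual correction method. It assumes a simplified multiplicative bias model, in which the observed utility of items in the affected group is the true utility scaled by an unknown factor, and it estimates that factor as the ratio of group means. The function names (`simulate_biased_scores`, `estimate_bias_factor`, `correct_scores`), the group labels, and the bias value are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def simulate_biased_scores(n=2000, true_bias=0.6, seed=0):
    """Draw true utilities from ONE shared distribution, then under-estimate group "B".

    Assumption (illustrative only): group bias acts as a multiplicative factor
    on the observed utility of the affected group.
    """
    rng = random.Random(seed)
    groups = [rng.choice(["A", "B"]) for _ in range(n)]
    true_utility = [rng.uniform(0.1, 1.0) for _ in range(n)]
    observed = [u * true_bias if g == "B" else u
                for u, g in zip(true_utility, groups)]
    return groups, true_utility, observed


def estimate_bias_factor(scores, groups, affected_group):
    """Estimate the bias factor as mean(affected scores) / mean(other scores).

    This is only valid under the assumption that both groups share the same
    true utility distribution, so any gap in means is attributed to bias.
    """
    affected = [s for s, g in zip(scores, groups) if g == affected_group]
    others = [s for s, g in zip(scores, groups) if g != affected_group]
    return mean(affected) / mean(others)


def correct_scores(scores, groups, affected_group):
    """Rescale the affected group's scores so that the group means match."""
    factor = estimate_bias_factor(scores, groups, affected_group)
    return [s / factor if g == affected_group else s
            for s, g in zip(scores, groups)]


if __name__ == "__main__":
    groups, true_utility, observed = simulate_biased_scores()
    factor = estimate_bias_factor(observed, groups, "B")
    corrected = correct_scores(observed, groups, "B")
    print(f"estimated bias factor: {factor:.3f}")
    print(f"mean corrected score, group B: "
          f"{mean(c for c, g in zip(corrected, groups) if g == 'B'):.3f}")
```

In this toy setting the estimate is pooled over many items. With only a handful of items per group, for example within a single query, the ratio of means becomes noisy, which is presumably the sparsity issue the abstract refers to. Forcing the group means to match also illustrates the equality-instead-of-equity concern, since it erases any genuine difference between groups.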