Online platforms mediate access to opportunity: relevance-based rankings create and constrain options by allocating exposure to job openings and job candidates on hiring platforms, or to sellers in a marketplace. To do so responsibly, these socially consequential systems employ various fairness measures and interventions, many of which seek to allocate exposure based on worthiness. Because worthiness is typically not directly observable, platforms must instead resort to proxy scores such as relevance, which they infer from behavioral signals such as searcher clicks. Yet it remains an open question whether relevance fulfills its role as such a worthiness score in high-stakes fair rankings. In this paper, we combine perspectives and tools from the social sciences, information retrieval, and fairness in machine learning to derive a set of desired criteria that relevance scores should satisfy in order to meaningfully guide fairness interventions. We then show empirically, in a case study of relevance inferred from biased user click data, that not all of these criteria are met. We assess the impact of these violations on the estimated system fairness and analyze whether existing fairness interventions can mitigate the identified issues. Our analyses and results surface the pressing need for new approaches to relevance collection and generation that are suitable for use in fair ranking.