Fairness in Aggregation: Optimal Top-$k$ and Improved Full Ranking

Ensuring fairness in algorithmic ranking systems is a critical challenge with significant societal implications for hiring, recommendations, web search, and data management. Standard methods for aggregating multiple preference orders into a consensus ranking may perpetuate and even amplify the underrepresentation of minority groups. To address this, recent research has focused on incorporating fairness constraints that guarantee the presence of different groups in the top-$k$ positions of the final aggregate ranking. We study two fairness-aware variants under the well-known Spearman footrule, which corresponds to the $L_1$ distance between rankings. First, we address the practically salient task of computing a fair aggregate top-$k$ ranking, which is crucial in settings like recommendations and hiring where selection is primarily based on the top-$k$ results, and present the first optimal algorithm for this problem. Second, we consider fair (full) rank aggregation over all candidates (not only the top-$k$). A $3$-approximation is known for this fair rank aggregation variant (Wei et al., SIGMOD'22; Chakraborty et al., NeurIPS'22), whereas an exact algorithm exists for the corresponding unconstrained (unfair) version (Dwork et al., WWW'01). Closing the computational gap between fair and unconstrained rank aggregation has remained an open problem. We make progress by giving a $2$-approximation algorithm for fair (full) rank aggregation, improving over the previous $3$-approximation. Further, we complement our theoretical contributions with experiments on different real-world datasets, which corroborate our theoretical results and demonstrate strong empirical performance relative to state-of-the-art baselines.
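To make the objective concrete, the following is a minimal Python sketch of the Spearman footrule (the $L_1$ distance between position vectors) and of fair aggregation by exhaustive search on a toy instance. It is not the algorithm from the paper: the brute-force search is only feasible for a handful of candidates, and the fairness rule used here (a minimum count per group in the top-$k$) is an assumed simplification, since the abstract does not spell out the exact constraint. All function and parameter names (`footrule`, `total_cost`, `is_fair`, `brute_force_fair_aggregate`, `min_per_group`) are hypothetical.

```python
from itertools import permutations


def footrule(r1, r2):
    """Spearman footrule: sum over candidates of |position in r1 - position in r2|."""
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(abs(i - pos2[c]) for i, c in enumerate(r1))


def total_cost(candidate, rankings):
    """Aggregation objective: total footrule distance to all input rankings."""
    return sum(footrule(candidate, r) for r in rankings)


def is_fair(ranking, group, k, min_per_group):
    """Assumed fairness rule: each listed group has at least its minimum count in the top-k."""
    top = ranking[:k]
    return all(
        sum(1 for c in top if group[c] == g) >= m
        for g, m in min_per_group.items()
    )


def brute_force_fair_aggregate(rankings, group, k, min_per_group):
    """Exhaustive search over all permutations; illustration only (factorial time).

    Returns None when no permutation satisfies the fairness rule.
    """
    candidates = sorted(rankings[0])
    feasible = (
        p for p in permutations(candidates)
        if is_fair(p, group, k, min_per_group)
    )
    return min(feasible, key=lambda p: total_cost(p, rankings), default=None)


# Toy instance: three voters, four candidates, two groups.
rankings = [("a", "b", "c", "d"), ("b", "a", "c", "d"), ("a", "c", "b", "d")]
group = {"a": "X", "b": "X", "c": "X", "d": "Y"}

fair = brute_force_fair_aggregate(rankings, group, k=2, min_per_group={"X": 1, "Y": 1})
unconstrained = brute_force_fair_aggregate(rankings, group, k=2, min_per_group={})
print(fair, total_cost(fair, rankings))
print(unconstrained, total_cost(unconstrained, rankings))
```

Passing an empty `min_per_group` removes the constraint, so the second call returns an unconstrained footrule-optimal ranking for comparison. The fair ranking can never cost less than the unconstrained one, because it minimizes the same objective over a subset of the permutations.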

