Randomized rankings have attracted recent interest as a way to achieve fairer ex-ante exposure and better robustness than deterministic rankings. We propose a set of natural axioms for randomized group-fair rankings and prove that there exists a unique distribution $D$ that satisfies our axioms and is supported only on ex-post group-fair rankings, i.e., rankings that satisfy given lower and upper bounds on group-wise representation in the top-$k$ ranks. Our problem formulation remains applicable even when there is implicit bias, incomplete relevance information, or only an ordinal ranking is available instead of relevance scores or utility values. We propose two algorithms to sample a random group-fair ranking from the distribution $D$. The first, based on dynamic programming, samples ex-post group-fair rankings uniformly at random in time $O(k^2\ell)$, where $\ell$ is the number of groups. The second, based on a random walk, samples ex-post group-fair rankings from a distribution that is $\delta$-close to $D$ in total variation distance and has expected running time $O^*(k^2\ell^2)$ when there is a sufficient gap between the given upper and lower bounds on the group-wise representation. The former samples exactly, whereas the latter runs significantly faster on real-world data sets for larger values of $k$. We give empirical evidence that our algorithms compare favorably against recent baselines for fairness and ranking utility on real-world data sets.
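To make the dynamic-programming idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that a group-fair outcome for the top-$k$ ranks can be summarized by a vector of per-group counts $(x_1, \dots, x_\ell)$ with $L_j \le x_j \le U_j$ and $\sum_j x_j = k$, and it samples such a vector uniformly at random by first counting feasible completions and then drawing each $x_j$ in proportion to the number of completions it leaves open. The names `count_table`, `sample_counts`, `lower`, and `upper` are hypothetical and chosen only for this illustration. The counting table takes $O(k^2\ell)$ arithmetic operations in this naive form.

```python
import random


def count_table(k, lower, upper):
    """ways[j][s] = number of count vectors for groups j, j+1, ... that sum to s
    while respecting lower[i] <= x_i <= upper[i] (hypothetical helper)."""
    n = len(lower)
    ways = [[0] * (k + 1) for _ in range(n + 1)]
    ways[n][0] = 1  # no groups left: only the empty sum 0 is feasible
    for j in range(n - 1, -1, -1):
        for s in range(k + 1):
            ways[j][s] = sum(
                ways[j + 1][s - x]
                for x in range(lower[j], min(upper[j], s) + 1)
            )
    return ways


def sample_counts(k, lower, upper, rng=random):
    """Draw a feasible count vector uniformly at random (illustrative sketch)."""
    ways = count_table(k, lower, upper)
    if ways[0][k] == 0:
        raise ValueError("no count vector satisfies the bounds")
    counts, remaining = [], k
    for j in range(len(lower)):
        options = list(range(lower[j], min(upper[j], remaining) + 1))
        # Weight each choice by the number of feasible completions it allows.
        weights = [ways[j + 1][remaining - x] for x in options]
        x = rng.choices(options, weights=weights)[0]
        counts.append(x)
        remaining -= x
    return counts
```

For example, `sample_counts(10, [2, 2, 2], [5, 5, 5])` returns three per-group counts that sum to 10, each between 2 and 5. Turning such counts into an actual ranking (for instance, by arranging group labels over the $k$ positions and filling each label with that group's items in their in-group order) is a further step that this sketch leaves out.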