We consider the problem of subset selection where one is given multiple rankings of items and the goal is to select the highest ``quality'' subset. Score functions from the multiwinner voting literature have been used to aggregate rankings into quality scores for subsets. We study this setting when, in addition, the rankings may contain systemic or unconscious biases toward a group of items. For a general model of input rankings and biases, we show that requiring the selected subset to satisfy group fairness constraints can improve the quality of the selection with respect to unbiased rankings. Importantly, we show that the number of rankings needed for fairness constraints to be effective can differ drastically across multiwinner score functions: for some functions, fairness constraints need exponentially many rankings to recover a close-to-optimal solution, while for others the dependence is only polynomial. This result relies on a novel notion of ``smoothness'' of submodular functions in this setting, which quantifies how well a function can ``correctly'' assess the quality of items in the presence of bias. The results in this paper can guide the choice of multiwinner score functions for the subset selection setting considered here; we additionally provide a tool that supports making this choice empirically.
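To make the setting concrete, the following is a minimal sketch of scoring subsets from multiple rankings and selecting under a group fairness constraint. It is an illustration under stated assumptions, not the method of this paper: the abstract does not name a specific score function, so the sketch assumes the Chamberlin-Courant rule with Borda weights (one well-known submodular multiwinner score function), assumes the fairness constraint is a lower bound on the number of selected items from one group, and uses a simple greedy heuristic. The names `cc_score`, `greedy_select`, and `min_from_group` are hypothetical. Note also that the sketch scores subsets against the observed rankings only, whereas the paper evaluates quality with respect to unbiased rankings.

```python
from typing import Hashable, Sequence, Set


def cc_score(rankings: Sequence[Sequence[Hashable]], subset: Set[Hashable]) -> int:
    """Chamberlin-Courant score with Borda weights (assumed score function).

    Each ranking contributes the Borda score (m - 1 - position) of its
    highest-ranked item that belongs to `subset`.
    """
    if not subset:
        return 0
    m = len(rankings[0])
    total = 0
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            if item in subset:
                total += m - 1 - pos
                break
    return total


def greedy_select(
    rankings: Sequence[Sequence[Hashable]],
    k: int,
    group: Set[Hashable],
    min_from_group: int,
) -> Set[Hashable]:
    """Greedily pick k items, taking at least `min_from_group` from `group`.

    Assumed constraint form: a lower bound on one group. Ties are broken
    by the item order of the first ranking.
    """
    items = list(rankings[0])
    chosen: Set[Hashable] = set()
    while len(chosen) < k:
        slots_left = k - len(chosen)
        still_needed = min_from_group - len(chosen & group)
        candidates = [i for i in items if i not in chosen]
        # If every remaining slot is needed to meet the lower bound,
        # only items from the protected group are eligible.
        if still_needed >= slots_left:
            candidates = [i for i in candidates if i in group]
        best = max(candidates, key=lambda i: cc_score(rankings, chosen | {i}))
        chosen.add(best)
    return chosen


# Toy input (hypothetical data): items "c" and "d" are ranked low everywhere.
rankings = [["a", "b", "c", "d"], ["b", "a", "d", "c"], ["a", "b", "d", "c"]]
unconstrained = greedy_select(rankings, k=2, group=set(), min_from_group=0)
constrained = greedy_select(rankings, k=2, group={"c", "d"}, min_from_group=1)
```

On this toy input the unconstrained selection contains only items outside the group, while the constrained call is forced to include one group member. Whether such a constraint improves quality with respect to unbiased rankings depends on the score function and the number of rankings, which is the question the paper studies.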