Finding a Fair Scoring Function for Top-$k$ Selection: From Hardness to Practice

Selecting the $k$ "best" items from a dataset of $n$ items according to a scoring function is a key task in decision-making. Given the rise of automated decision-making software, it is important that the outcome of this process, called top-$k$ selection, is fair. Here we consider the problem of identifying a fair linear scoring function for top-$k$ selection. The function computes a score for each item as a weighted sum of its (numerical) attribute values, and must ensure that the selected subset includes adequate representation of a minority or historically disadvantaged group. Existing algorithms do not scale efficiently, particularly in higher dimensions. Our hardness analysis shows that in more than two dimensions, no algorithm is likely to achieve good scalability with respect to dataset size, and the computational complexity is likely to increase rapidly with dimensionality. However, the hardness results also provide key insights that guide algorithm design, leading to our two-pronged solution: (1) For small values of $k$, our hardness analysis reveals a gap in the hardness barrier. By addressing various engineering challenges, including achieving efficient parallelism, we turn this potential for efficiency into an optimized algorithm that delivers substantial practical performance gains. (2) For large values of $k$, where the hardness is robust, we employ a practically efficient algorithm which, despite being worse in theory, achieves superior real-world performance. Experimental evaluations on real-world datasets then explore scenarios where worst-case behavior does not manifest, identifying areas critical to practical performance. Our solution achieves speed-ups of up to several orders of magnitude compared to the state of the art, an efficiency made possible through a tight integration of hardness analysis, algorithm design, practical engineering, and empirical evaluation.
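To make the problem statement concrete, the following is a minimal sketch of the setting described above, not the algorithm from the paper. It scores items by a weighted sum, takes the top $k$, and checks a fairness condition. Two things are assumptions made only for illustration: "adequate representation" is modeled as a lower bound on the number of protected-group members in the selection, and the search is a naive grid over two-dimensional weight vectors. All names (`top_k`, `is_fair`, `naive_search_2d`, `min_protected`, `steps`) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def top_k(items: Sequence[Sequence[float]], weights: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring items under a linear scoring function."""
    # Score of an item = weighted sum of its numerical attribute values.
    scores = [sum(w * x for w, x in zip(weights, item)) for item in items]
    # Sort by descending score; ties are broken by the lower index (an arbitrary choice).
    order = sorted(range(len(items)), key=lambda i: (-scores[i], i))
    return order[:k]


def is_fair(selected: Sequence[int], protected: Sequence[bool], min_protected: int) -> bool:
    """Assumed fairness notion: at least min_protected selected items are in the protected group."""
    return sum(1 for i in selected if protected[i]) >= min_protected


def naive_search_2d(
    items: Sequence[Sequence[float]],
    protected: Sequence[bool],
    k: int,
    min_protected: int,
    steps: int = 100,
) -> Optional[Tuple[float, float]]:
    """Brute-force scan over weights (w, 1 - w) on a grid; illustration only, not scalable."""
    for s in range(steps + 1):
        w = s / steps
        weights = (w, 1.0 - w)
        if is_fair(top_k(items, weights, k), protected, min_protected):
            return weights
    return None  # No fair weight vector found on the grid.


# Toy data (made up): two attributes per item, items 2 and 3 belong to the protected group.
items = [(9, 1), (8, 2), (3, 9), (2, 8), (5, 5)]
protected = [False, False, True, True, False]

unfair_selection = top_k(items, (1.0, 0.0), k=2)   # only the first attribute counts
fair_selection = top_k(items, (0.5, 0.5), k=2)     # both attributes weighted equally
found = naive_search_2d(items, protected, k=2, min_protected=1)
```

In this toy example, weighting only the first attribute selects no protected item, while equal weights select one. The grid scan recomputes the full ranking for every candidate weight vector, which is exactly the kind of approach that does not scale with dataset size or dimensionality.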
