Explaining Rankings with Hidden Group Bonuses

Determining a linear utility function that correlates with observed candidate rankings is a foundational problem with applications in domains such as admissions, hiring, and recommendation systems, e.g., [Storandt and Funke, AAAI'19, Zhang et al., KDD'23, Wang et al., ICDE'24 (best paper award), Chen and Wong, VLDB'24]. Traditionally, these models assume full visibility into the feature sets used to determine the utility score. However, real-world scenarios often involve sensitive attributes that are hidden or partially observed, yet may influence outcomes through additive bonuses designed to promote fairness, as in [Gale and Marian, ICDE'24]. Motivated by such practical concerns, we study a variant of the ranking explanation problem where sensitive features are unobserved but may influence candidate rankings through group-specific linear boosts. We present a formal framework for modeling this problem and develop an algorithmic solution that leverages constraint satisfaction and automated reasoning techniques to jointly infer the linear scoring parameters and latent group bonuses consistent with the observed rankings. We further show that determining a satisfying linear function with group-specific bonuses is \textsf{NP}-hard in general, but when the feature dimension and the number of groups are constant, the problem admits a polynomial-time solution. Our approach is the first to address this nuanced variant, which captures key real-world challenges in fair ranking and admission systems. We perform extensive experiments on both real-world and synthetic datasets, demonstrating that our method effectively recovers hidden bonus structures and provides faithful explanations of observed ranking outcomes.
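To make the problem statement concrete, the following minimal Python sketch illustrates the scoring model described above: each candidate's score is a linear function of its observed features plus a bonus determined by its hidden group. This is not the constraint-satisfaction method of the paper. It is an exhaustive search over small, user-supplied grids, shown only to clarify what "consistent with the observed ranking" means. All names (`scores`, `explains`, `brute_force_explain`, `weight_grid`, `bonus_grid`) are hypothetical, and the sketch assumes that group membership is unobserved, that group 0 is a baseline with zero bonus, and that the ranking is strict.

```python
from itertools import product

def scores(X, w, groups, bonus):
    """Score of each candidate i: w . x_i + bonus[groups[i]]."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + bonus[g]
            for x, g in zip(X, groups)]

def explains(X, ranking, w, groups, bonus, margin=1e-9):
    """True if scores strictly decrease along the observed ranking."""
    s = scores(X, w, groups, bonus)
    return all(s[a] > s[b] + margin for a, b in zip(ranking, ranking[1:]))

def brute_force_explain(X, ranking, n_groups, weight_grid, bonus_grid):
    """Search grids of weights, bonuses, and hidden group assignments.

    Returns the first (w, bonus, groups) that reproduces the ranking,
    or None if nothing on the grids does. The number of group
    assignments alone is n_groups ** len(X), so this is only usable
    on toy inputs.
    """
    n, d = len(X), len(X[0])
    for w in product(weight_grid, repeat=d):
        for bonus_rest in product(bonus_grid, repeat=n_groups - 1):
            bonus = (0.0,) + bonus_rest  # assumption: group 0 gets no bonus
            for groups in product(range(n_groups), repeat=n):
                if explains(X, ranking, w, groups, bonus):
                    return w, bonus, groups
    return None

# Toy data (made up): one feature, three candidates.
# Candidate 1 (feature 3) is ranked above candidate 2 (feature 4),
# which no non-negative weight can explain without a bonus.
X = [(5,), (3,), (4,)]
ranking = [0, 1, 2]  # candidate indices, best first

no_bonus = brute_force_explain(X, ranking, 1, [0.0, 1.0], [])
with_bonus = brute_force_explain(X, ranking, 2, [0.0, 1.0], [0.0, 1.0, 2.0])
print(no_bonus)
print(with_bonus)
```

On this toy input, the search with a single group finds no explanation, while allowing two groups yields one in which candidates 1 and 2 fall into different hidden groups. The exponential loop over group assignments reflects why the general problem is hard, and why a constraint solver is a more natural tool than enumeration.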
