Uncovering Discrimination Clusters: Quantifying and Explaining Systematic Fairness Violations

Fairness in algorithmic decision-making is often framed in terms of individual fairness, which requires that similar individuals receive similar outcomes. A system violates individual fairness if there exists a pair of inputs that differ only in protected attributes (such as race or gender) yet lead to significantly different outcomes, for example one favorable and the other unfavorable. While this notion highlights isolated instances of unfairness, it fails to capture broader patterns of systematic or clustered discrimination that may affect entire subgroups. We introduce and motivate the concept of discrimination clustering, a generalization of individual fairness violations. Rather than detecting single counterfactual disparities, we seek to uncover regions of the input space where small perturbations in protected features lead to k significantly distinct clusters of outcomes. That is, for a given input, we identify a local neighborhood, whose members differ from the input only in protected attributes, in which the outputs separate into many distinct clusters. These clusters reveal significant arbitrariness in treatment based solely on protected attributes, and they help expose patterns of algorithmic bias that elude pairwise fairness checks. We present HyFair, a hybrid technique that combines formal symbolic analysis (via SMT and MILP solvers) to certify individual fairness with randomized search to discover discriminatory clusters. This combination enables both formal guarantees, when no counterexamples exist, and the detection of severe violations that are computationally challenging for symbolic methods alone. Given a set of inputs exhibiting high k-unfairness, we introduce a novel explanation method that generates interpretable, decision-tree-style artifacts. Our experiments demonstrate that HyFair outperforms state-of-the-art fairness verification and local explanation methods.
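To make the notion of outcome clusters concrete, the following is a minimal Python sketch and not the HyFair implementation. It assumes a one-dimensional model score, enumerates every combination of protected-attribute values for a fixed input, and counts clusters as maximal groups of sorted scores whose consecutive gaps do not exceed a threshold. The function names, the `gap` parameter, the clustering rule, and the toy model are all illustrative assumptions; the abstract does not specify how clusters are formally defined.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence


def counterfactual_outputs(
    model: Callable[[Dict[str, object]], float],
    x: Dict[str, object],
    protected_domains: Dict[str, Sequence[object]],
) -> List[float]:
    """Score every variant of x that differs only in protected attributes."""
    names = list(protected_domains)
    outputs = []
    for values in product(*(protected_domains[n] for n in names)):
        variant = dict(x)                  # copy the non-protected features
        variant.update(zip(names, values))  # overwrite the protected ones
        outputs.append(model(variant))
    return outputs


def count_outcome_clusters(outputs: Sequence[float], gap: float) -> int:
    """Count groups of sorted outputs separated by more than `gap`.

    Assumed clustering rule (hypothetical): two neighbouring scores belong
    to the same cluster when they differ by at most `gap`.
    """
    if not outputs:
        return 0
    ordered = sorted(outputs)
    clusters = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > gap:
            clusters += 1
    return clusters


def toy_model(a: Dict[str, object]) -> float:
    """Hypothetical biased scorer used only for illustration."""
    score = 0.1 * a["income"]
    score += {"A": 0.0, "B": 0.3, "C": 0.6}[a["race"]]
    score += 0.02 if a["gender"] == "F" else 0.0
    return score


x = {"income": 2.0, "race": "A", "gender": "M"}
domains = {"race": ["A", "B", "C"], "gender": ["F", "M"]}

outs = counterfactual_outputs(toy_model, x, domains)
k = count_outcome_clusters(outs, gap=0.1)
print(k)  # 3 for this toy model: one cluster per value of "race"
```

In this toy setting a pairwise check would only report that some pair of counterfactuals disagrees, whereas the cluster count shows that the protected attributes alone split the same individual into three separated outcome groups.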
