Algorithmic fairness is a socially crucial topic in real-world applications of AI. Among the many notions of fairness, subgroup fairness is widely studied when multiple sensitive attributes (e.g., gender, race, age) are present. However, as the number of sensitive attributes grows, the number of subgroups increases accordingly, creating heavy computational burdens and a data-sparsity problem (subgroups with very small sample sizes). In this paper, we develop a novel learning algorithm for subgroup fairness that resolves these issues by focusing on subgroups with sufficient sample sizes as well as on marginal fairness (fairness with respect to each individual sensitive attribute). To this end, we formalize a notion of subgroup-subset fairness and introduce a corresponding distributional fairness measure called the supremum Integral Probability Metric (supIPM). Building on this formulation, we propose the Doubly Regressing Adversarial learning for subgroup Fairness (DRAF) algorithm, which reduces a surrogate fairness gap for supIPM with far less computation than minimizing supIPM directly. Theoretically, we prove that the proposed surrogate fairness gap is an upper bound on supIPM. Empirically, we show that the DRAF algorithm outperforms baseline methods on benchmark datasets, especially when the number of sensitive attributes is large so that many subgroups are very small.