Algorithmic fairness in machine learning has recently garnered significant attention. However, two pressing challenges remain: (1) the fairness guarantees of existing fair classification methods often rely on specific data distribution assumptions and large sample sizes, which can lead to fairness violations when the sample size is moderate, a common situation in practice; (2) due to legal and societal considerations, using sensitive group attributes during decision-making (referred to as the group-blind setting) may not always be feasible. In this work, we quantify the impact of enforcing algorithmic fairness and group-blindness in binary classification under group fairness constraints. Specifically, we propose a unified framework for fair classification that provides distribution-free and finite-sample fairness guarantees with controlled excess risk. This framework is applicable to various group fairness notions in both group-aware and group-blind settings. Furthermore, we establish a minimax lower bound on the excess risk, showing that our proposed algorithm is minimax optimal up to logarithmic factors. Through extensive simulation studies and real-data analysis, we further demonstrate that our algorithm outperforms existing methods, and we provide empirical support for our theoretical findings.
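As one concrete instance of such a constraint (an illustrative choice; the framework covers other notions as well), demographic parity at a tolerance $\delta \ge 0$ requires
\[
\bigl|\,\mathbb{P}\{f(X)=1 \mid A=1\} - \mathbb{P}\{f(X)=1 \mid A=0\}\,\bigr| \le \delta,
\]
where $f$ denotes the classifier, $X$ the non-sensitive features, and $A \in \{0,1\}$ the sensitive group attribute (symbols introduced here for illustration). In the group-blind setting, $f$ may depend only on $X$, yet the constraint is still evaluated with respect to $A$.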