To mitigate the bias exhibited by machine learning models, fairness criteria can be integrated into the training process to ensure fair treatment across all demographics, but this often comes at the expense of model performance. Understanding such tradeoffs is therefore fundamental to the design of fair algorithms. To this end, this paper provides a complete characterization of the inherent tradeoff of demographic parity on classification problems, under the most general multi-group, multi-class, and noisy setting. Specifically, we show that the minimum error rate achievable by randomized and attribute-aware fair classifiers is given by the optimal value of a Wasserstein-barycenter problem. On the practical side, our findings lead to a simple post-processing algorithm that derives fair classifiers from score functions and yields the optimal fair classifier when the score is Bayes optimal. We provide a suboptimality and sample complexity analysis for our algorithm, and demonstrate its effectiveness on benchmark datasets.
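As an illustration of the demographic parity criterion referred to in the abstract, the following minimal Python sketch measures how far a multi-class classifier's predictions are from parity across groups: for each class it compares the rate at which that class is predicted in each group and reports the largest difference. The function name `demographic_parity_gap` and its arguments are hypothetical and not taken from the paper. The sketch only evaluates predictions; it is not the post-processing algorithm itself.

```python
import numpy as np

def demographic_parity_gap(y_pred, groups, num_classes):
    """Largest difference, over classes, between the highest and lowest
    per-group rate at which that class is predicted.

    A value of 0.0 means every group receives each predicted class at the
    same rate, i.e. exact demographic parity on this sample.
    """
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    rates = []
    for g in np.unique(groups):
        mask = groups == g
        # Empirical distribution of predicted classes within group g.
        counts = np.bincount(y_pred[mask], minlength=num_classes)
        rates.append(counts / mask.sum())
    rates = np.stack(rates)  # shape: (num_groups, num_classes)
    return float((rates.max(axis=0) - rates.min(axis=0)).max())

# Both groups are predicted class 0 half of the time: no gap.
balanced = demographic_parity_gap([0, 1, 1, 0], [0, 0, 1, 1], num_classes=2)

# Each group always receives a different class: maximal gap.
separated = demographic_parity_gap([0, 0, 1, 1], [0, 0, 1, 1], num_classes=2)
```

A gap of this kind is estimated from a finite sample, so small nonzero values can arise from sampling noise even for a classifier that satisfies parity in distribution.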