Although many fairness criteria have been proposed to ensure that machine learning algorithms do not exhibit or amplify existing social biases, these algorithms are trained on datasets that can themselves be statistically biased. In this paper, we investigate the robustness of a number of existing (demographic) fairness criteria when the algorithm is trained on biased data. We consider two forms of dataset bias: errors by prior decision makers in the labeling process, and errors in measuring the features of disadvantaged individuals. We analytically show that some constraints (such as Demographic Parity) can remain robust when facing certain statistical biases, while others (such as Equalized Odds) are significantly violated when the algorithm is trained on biased data. We also analyze the sensitivity of these criteria, and of the decision maker's utility, to biases. We provide numerical experiments based on three real-world datasets (the FICO, Adult, and German credit score datasets) that support our analytical findings. Our findings offer an additional guideline for choosing among existing fairness criteria, or for proposing new criteria, when available datasets may be biased.
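For readers unfamiliar with the two criteria named above, the following minimal sketch shows how their violations are commonly measured for two groups. It is not the analysis of this paper: the function names, the group labels "a" and "b", and the eight-person toy dataset are all assumptions made for illustration. It shows only that the Demographic Parity gap is computed from decisions alone, whereas the Equalized Odds gap depends on the labels used, so labeling errors change its measured value.

```python
def demographic_parity_gap(decisions, groups):
    """Absolute difference in positive-decision rates between groups 'a' and 'b'."""
    rates = {}
    for g in ("a", "b"):
        picked = [d for d, gi in zip(decisions, groups) if gi == g]
        rates[g] = sum(picked) / len(picked)
    return abs(rates["a"] - rates["b"])


def conditional_rate(decisions, labels, groups, group, label):
    """Positive-decision rate among members of `group` whose label equals `label`."""
    picked = [d for d, y, g in zip(decisions, labels, groups)
              if g == group and y == label]
    return sum(picked) / len(picked)


def equalized_odds_gap(decisions, labels, groups):
    """Larger of the between-group gaps in true-positive and false-positive rates."""
    tpr_gap = abs(conditional_rate(decisions, labels, groups, "a", 1)
                  - conditional_rate(decisions, labels, groups, "b", 1))
    fpr_gap = abs(conditional_rate(decisions, labels, groups, "a", 0)
                  - conditional_rate(decisions, labels, groups, "b", 0))
    return max(tpr_gap, fpr_gap)


# Hypothetical toy data: four individuals per group.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
decisions = [1, 1, 0, 0, 1, 1, 0, 0]
true_labels = [1, 1, 0, 0, 1, 1, 0, 0]
# Assumed labeling bias: one qualified member of group "b" is recorded as 0.
biased_labels = [1, 1, 0, 0, 1, 0, 0, 0]

dp_gap = demographic_parity_gap(decisions, groups)
eo_gap_true = equalized_odds_gap(decisions, true_labels, groups)
eo_gap_biased = equalized_odds_gap(decisions, biased_labels, groups)
print(dp_gap, eo_gap_true, eo_gap_biased)
```

In this toy case the same decisions satisfy Equalized Odds exactly with respect to the true labels, yet appear to violate it when evaluated against the biased labels; the Demographic Parity gap is unaffected because it never consults the labels.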