What Is Fairness? On the Role of Protected Attributes and Fictitious Worlds

A growing body of literature in fairness-aware ML (fairML) aims to mitigate unfairness related to machine learning (ML) in automated decision-making (ADM) by defining metrics that measure the fairness of an ML model and by proposing methods that ensure trained ML models achieve low values on those metrics. However, the underlying concept of fairness, i.e., the question of what fairness is, is rarely discussed, leaving a considerable gap between centuries of philosophical discussion and the recent adoption of the concept in the ML community. In this work, we try to bridge this gap by formalizing a consistent concept of fairness and by translating the philosophical considerations into a formal framework for the training and evaluation of ML models in ADM systems. We show that fairness problems can arise even when no protected attributes (PAs) are present, and we point out that fairness and predictive performance are not irreconcilable opposites; rather, the latter is necessary to achieve the former. Moreover, we argue why and how causal considerations are necessary when assessing fairness in the presence of PAs by proposing a fictitious, normatively desired (FiND) world in which the PAs have no causal effects. In practice, this FiND world must be approximated by a warped world, for which the causal effects of the PAs must be removed from the real-world data. Overall, we achieve greater linguistic clarity for the discussion of fairML. We propose initial algorithms for practical applications and present illustrative experiments on COMPAS data.
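To make the idea of a warped world more concrete, the following is a minimal sketch and not the algorithm proposed in this work. It assumes a single binary PA `a` that shifts one numeric feature `x` additively, with no other confounders or mediators; the function name `warp_feature`, the `reference` group, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def warp_feature(x, a, reference=0):
    """Shift each group's feature values so that every group has the mean
    of the reference group. This removes the PA's effect only under the
    assumed additive model x = mu[a] + noise (illustrative assumption)."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a)
    # Estimate the group-wise means of the feature.
    group_means = {g: x[a == g].mean() for g in np.unique(a)}
    # Per-individual shift: own group mean minus reference group mean.
    shift = np.array([group_means[g] - group_means[reference] for g in a])
    return x - shift

# Simulated "real world": the PA raises the feature by 2 on average.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=1000)
x = 1.0 + 2.0 * a + rng.normal(size=1000)

# Approximated "warped world": the group difference in means is removed.
x_warped = warp_feature(x, a)
```

In this toy setting the reference group is left untouched and the other group is moved onto it. A realistic application would need an explicit causal model of how the PAs affect each feature and the target, which this sketch does not attempt.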
