The Unfairness of Fair Machine Learning: Levelling down and strict egalitarianism by default

In recent years, fairness in machine learning (ML) has emerged as a highly active area of research and development. Most work defines fairness in simple terms: reducing gaps in performance or outcomes between demographic groups while preserving as much of the accuracy of the original system as possible. This oversimplification of equality through fairness measures is troubling. Many current fairness measures suffer from both fairness and performance degradation, or "levelling down," where fairness is achieved by making every group worse off, or by bringing better performing groups down to the level of the worst off. When fairness can only be achieved by making everyone worse off in material or relational terms, through injuries of stigma, loss of solidarity, unequal concern, and missed opportunities for substantive equality, something would appear to have gone wrong in translating the vague concept of "fairness" into practice. This paper examines the causes and prevalence of levelling down across fairML, and explores possible justifications and criticisms based on philosophical and legal theories of equality and distributive justice, as well as equality law jurisprudence. We find that fairML does not currently engage in the type of measurement, reporting, or analysis necessary to justify levelling down in practice. We propose a first step towards substantive equality in fairML: "levelling up" systems by design through enforcement of minimum acceptable harm thresholds, or "minimum rate constraints," as fairness constraints. We likewise propose an alternative harms-based framework to counter the oversimplified egalitarian framing currently dominant in the field and to push future discussion towards opportunities for substantive equality and away from strict egalitarianism by default. N.B. Shortened abstract, see paper for full abstract.
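To make the contrast between levelling down and a minimum rate constraint concrete, the following is a minimal sketch and not the paper's implementation. It assumes a binary classifier with per-group decision thresholds, uses recall as the rate of interest, and runs on invented toy scores; the function names, the 0.75 minimum, and the data are all hypothetical.

```python
import math

def recall_at(scores, labels, threshold):
    """Fraction of true positives (label 1) scored at or above the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)

def threshold_for_min_recall(scores, labels, min_recall):
    """Highest threshold at which recall still meets min_recall."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    # Number of positives that must be accepted; round() guards against float noise.
    needed = max(1, math.ceil(round(min_recall * len(positives), 9)))
    return positives[needed - 1]

# Hypothetical toy data: (scores, labels) per demographic group.
groups = {
    "A": ([0.9, 0.8, 0.7, 0.6, 0.4, 0.3], [1, 1, 1, 1, 0, 0]),
    "B": ([0.8, 0.6, 0.45, 0.3, 0.4, 0.2], [1, 1, 1, 1, 0, 0]),
}
DEFAULT_THRESHOLD = 0.5
MIN_RECALL = 0.75  # assumed minimum acceptable rate

# Baseline: one shared threshold, unequal recall across groups.
baseline = {g: recall_at(s, y, DEFAULT_THRESHOLD) for g, (s, y) in groups.items()}

# Levelling down: close the gap by raising A's threshold until its recall
# drops to B's level. The gap is zero, but no group is better off.
levelled_down = {
    "A": recall_at(*groups["A"], 0.8),
    "B": recall_at(*groups["B"], DEFAULT_THRESHOLD),
}

# Levelling up: only lower a group's threshold when it falls short of the
# minimum rate; groups already above the minimum are left untouched.
levelled_up = {}
for g, (s, y) in groups.items():
    t = min(DEFAULT_THRESHOLD, threshold_for_min_recall(s, y, MIN_RECALL))
    levelled_up[g] = recall_at(s, y, t)

print("baseline     ", baseline)
print("levelled down", levelled_down)
print("levelled up  ", levelled_up)
```

In this toy setting the minimum rate constraint raises group B's recall without reducing group A's, whereas strict equalisation lowers A to match B. On real data, lowering a threshold generally admits more false positives as well, so the trade-off would need to be measured rather than assumed.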
