Most research on fair machine learning has prioritized optimizing criteria such as Demographic Parity and Equalized Odds. Despite these efforts, there remains a limited understanding of how different bias mitigation strategies affect individual predictions and whether they introduce arbitrariness into the debiasing process. This paper addresses these gaps by exploring whether models that achieve comparable fairness and accuracy metrics affect the same individuals and mitigate bias in a consistent manner. We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation along five dimensions: Impact Size (how many people were affected), Change Direction (positive versus negative changes), Decision Rates (impact on models' acceptance rates), Affected Subpopulations (who was affected), and Neglected Subpopulations (where unfairness persists). This framework is intended to help practitioners understand the impacts of debiasing processes and make better-informed decisions regarding model selection. Applying FRAME to various bias mitigation approaches across key datasets reveals significant differences in how debiasing methods behave. These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.
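To make the first three FRAME dimensions concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary predictions from a baseline model and a debiased model on the same individuals, plus a sensitive-group label per individual. The function names `frame_summary` and `affected_overlap` and the toy data are hypothetical, and the two subpopulation dimensions (Affected and Neglected Subpopulations) are not covered here.

```python
import numpy as np


def frame_summary(y_base, y_debiased, group):
    """Summarize how a debiased model's binary predictions differ from a baseline.

    Hypothetical helper: covers Impact Size, Change Direction and Decision Rates only.
    """
    y_base = np.asarray(y_base)
    y_debiased = np.asarray(y_debiased)
    group = np.asarray(group)

    changed = y_base != y_debiased                 # individuals whose prediction flipped
    positive = (y_base == 0) & (y_debiased == 1)   # rejected -> accepted
    negative = (y_base == 1) & (y_debiased == 0)   # accepted -> rejected

    # Acceptance rate per group, before and after debiasing.
    rates = {}
    for g in np.unique(group):
        mask = group == g
        rates[g.item()] = (float(y_base[mask].mean()), float(y_debiased[mask].mean()))

    return {
        "impact_size": float(changed.mean()),
        "positive_flips": int(positive.sum()),
        "negative_flips": int(negative.sum()),
        "decision_rates": rates,
        "affected": set(np.flatnonzero(changed).tolist()),
    }


def affected_overlap(affected_a, affected_b):
    """Jaccard overlap between the sets of individuals changed by two methods."""
    union = affected_a | affected_b
    return len(affected_a & affected_b) / len(union) if union else 1.0


# Toy data (made up for illustration): 8 individuals, two groups.
y_base = [1, 1, 0, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
method_a = [1, 0, 0, 0, 1, 1, 0, 0]  # lowers group 0's rate and raises group 1's
method_b = [1, 1, 0, 0, 1, 1, 1, 0]  # only raises group 1's rate

summary_a = frame_summary(y_base, method_a, group)
summary_b = frame_summary(y_base, method_b, group)
overlap = affected_overlap(summary_a["affected"], summary_b["affected"])
print(summary_a["impact_size"], summary_b["impact_size"], overlap)
```

In this toy example both methods change the same share of predictions, yet they change largely different individuals and in different directions. That is the kind of discrepancy the framework is meant to surface when aggregate fairness and accuracy metrics look comparable.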