This paper introduces the Fair Fairness Benchmark (\textsf{FFB}), a benchmarking framework for in-processing group fairness methods. Ensuring fairness in machine learning is critical for ethical and legal compliance. However, comparing and developing fairness methods remains challenging because of inconsistent experimental settings, a lack of accessible algorithmic implementations, and the limited extensibility of current fairness packages and tools. To address these issues, we introduce an open-source, standardized benchmark for evaluating in-processing group fairness methods and provide a comprehensive analysis of state-of-the-art methods for ensuring different notions of group fairness. This work makes the following key contributions: flexible, extensible, minimalistic, and research-oriented open-source code; unified benchmarking pipelines for fairness methods; and extensive benchmarking that yields key insights from $\mathbf{45,079}$ experiments. We believe our work will significantly facilitate the growth and development of the fairness research community. The benchmark, including code and running logs, is available at https://github.com/ahxt/fair_fairness_benchmark