FairGridSearch: A Framework to Compare Fairness-Enhancing Models

Machine learning models are increasingly used in critical decision-making applications. However, these models are susceptible to replicating or even amplifying bias present in real-world data. While there are various bias mitigation methods and base estimators in the literature, selecting the optimal model for a specific application remains challenging. This paper focuses on binary classification and proposes FairGridSearch, a novel framework for comparing fairness-enhancing models. FairGridSearch enables experimentation with different model parameter combinations and recommends the best one. The study applies FairGridSearch to three popular datasets (Adult, COMPAS, and German Credit) and analyzes the impacts of metric selection, base estimator choice, and classification threshold on model fairness. The results highlight the significance of selecting appropriate accuracy and fairness metrics for model evaluation. Additionally, different base estimators and classification threshold values affect the effectiveness of bias mitigation methods and fairness stability respectively, but the effects are not consistent across all datasets. Based on these findings, future research on fairness in machine learning should consider a broader range of factors when building fair models, going beyond bias mitigation methods alone.
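To make the idea of searching over model and threshold combinations concrete, the following is a minimal sketch, not the paper's implementation. It assumes each candidate model is represented only by precomputed scores on a toy dataset, uses statistical parity difference as the fairness metric, and ranks combinations by a hypothetical cost of (1 - accuracy) plus the fairness gap. The abstract does not specify the selection criterion, so the names `fair_grid_search`, `model_a`, `model_b`, and the cost itself are illustrative assumptions.

```python
from itertools import product


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def statistical_parity_difference(y_pred, group):
    """Absolute gap in positive-prediction rates between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        preds = [p for p, s in zip(y_pred, group) if s == g]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def fair_grid_search(candidates, thresholds, y_true, group):
    """Evaluate every (model, threshold) pair and return the lowest-cost one.

    The cost (1 - accuracy) + fairness gap is an assumption for this sketch.
    """
    results = []
    for (name, scores), thr in product(candidates.items(), thresholds):
        y_pred = [int(s >= thr) for s in scores]
        acc = accuracy(y_true, y_pred)
        spd = statistical_parity_difference(y_pred, group)
        results.append({
            "model": name,
            "threshold": thr,
            "accuracy": acc,
            "spd": spd,
            "cost": (1 - acc) + spd,
        })
    best = min(results, key=lambda r: r["cost"])
    return best, results


# Toy data: labels, a binary protected attribute, and scores from two
# hypothetical models (stand-ins for base estimator + mitigation combos).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
candidates = {
    "model_a": [0.9, 0.4, 0.8, 0.7, 0.3, 0.2, 0.6, 0.1],
    "model_b": [0.6, 0.5, 0.7, 0.55, 0.5, 0.4, 0.65, 0.45],
}
thresholds = [0.3, 0.5, 0.7]

best, results = fair_grid_search(candidates, thresholds, y_true, group)
print(best)
```

Changing the accuracy metric, the fairness metric, or the weighting in the cost can change which combination is ranked first, which is the kind of sensitivity to metric choice that the abstract reports.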

