Interest in improving the fairness of machine learning (ML) is growing. Despite the increasing number of fairness-improving methods, we lack a systematic understanding of the trade-offs among the factors considered in the ML pipeline when such methods are applied. This understanding is essential for developers to make informed decisions when providing fair ML services. However, analyzing these trade-offs is extremely difficult when multiple fairness parameters and other crucial metrics are involved, coupled, and even in conflict with one another. This paper uses causality analysis as a principled method for analyzing trade-offs between fairness parameters and other crucial metrics in ML pipelines. To conduct causality analysis practically and effectively, we propose a set of domain-specific optimizations that facilitate accurate causal discovery, together with a novel, unified interface for trade-off analysis based on well-established causal inference methods. We conduct a comprehensive empirical study of a collection of widely used fairness-improving techniques on three real-world datasets. Our study yields actionable suggestions for users and developers of fair ML. We further demonstrate the versatility of our approach in selecting the optimal fairness-improving method, paving the way for more ethical and socially responsible AI technologies.