FITNESS: A Causal De-correlation Approach for Mitigating Bias in Machine Learning Software

Software built on top of machine learning algorithms is becoming increasingly prevalent in a variety of fields, including college admissions, healthcare, insurance, and justice. The effectiveness and efficiency of these systems depend heavily on the quality of the training datasets. Biased datasets can lead to unfair and potentially harmful outcomes, particularly in critical decision-making systems where the allocation of resources may be affected. This can exacerbate discrimination against certain groups and cause significant social disruption. To mitigate such unfairness, a series of bias mitigation methods have been proposed. Generally, these methods improve the fairness of the trained models to a certain degree, but at the expense of model performance. In this paper, we propose FITNESS, a bias mitigation approach that de-correlates the causal effects between sensitive features (e.g., sex) and the label. Our key idea is that by de-correlating such effects from a causality perspective, the model avoids making predictions based on sensitive features, and fairness can thus be improved. Furthermore, FITNESS leverages multi-objective optimization to achieve a better performance-fairness trade-off. To evaluate its effectiveness, we compare FITNESS with 7 state-of-the-art methods on 8 benchmark tasks using multiple metrics. Results show that FITNESS outperforms the state-of-the-art methods on bias mitigation while preserving the model's performance: it improved the model's fairness in all scenarios while decreasing the model's performance in only 26.67% of the scenarios. Additionally, FITNESS surpasses the Fairea Baseline in 96.72% of cases, outperforming all the methods we compared.
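The two ideas named in the abstract, the association between a sensitive feature and the label and the performance-fairness trade-off, can be illustrated with a small sketch. This is not the FITNESS algorithm, whose details are not given here. It only assumes binary labels and a binary sensitive feature, measures the gap in favorable-label rates between the two groups, and keeps the candidate models that are not dominated on accuracy and bias. The names `label_rate_gap` and `pareto_front`, and all the numbers in the demo, are hypothetical.

```python
from typing import List, Sequence, Tuple


def label_rate_gap(sensitive: Sequence[int], labels: Sequence[int],
                   privileged: int = 1) -> float:
    """Favorable-label rate of the privileged group minus that of the rest.

    A value near 0 means the label is (marginally) unrelated to the
    sensitive feature; this is an association measure, not a causal effect.
    """
    priv = [y for s, y in zip(sensitive, labels) if s == privileged]
    unpriv = [y for s, y in zip(sensitive, labels) if s != privileged]
    if not priv or not unpriv:
        raise ValueError("both groups must contain at least one sample")
    return sum(priv) / len(priv) - sum(unpriv) / len(unpriv)


# A candidate is (name, accuracy, bias): higher accuracy and lower bias are better.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: Sequence[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    front = []
    for c in candidates:
        dominated = any(
            o[1] >= c[1] and o[2] <= c[2] and (o[1] > c[1] or o[2] < c[2])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


if __name__ == "__main__":
    # Hypothetical toy data: 1 = privileged group / favorable label.
    sex = [1, 1, 1, 1, 0, 0, 0, 0]
    label = [1, 1, 1, 0, 1, 0, 0, 0]
    print("label rate gap:", label_rate_gap(sex, label))

    # Hypothetical (accuracy, bias) results for four trained models.
    models = [("a", 0.85, 0.20), ("b", 0.84, 0.05),
              ("c", 0.80, 0.10), ("d", 0.86, 0.25)]
    print("non-dominated models:", [m[0] for m in pareto_front(models)])
```

Selecting from the non-dominated set, rather than optimizing fairness alone, is one simple way to express the goal of improving fairness without giving up performance unnecessarily.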
