Counterfactual fairness is an approach to AI fairness that aims to base decisions on the outcomes that an individual with a sensitive status would have had without that status. This paper proposes Double Machine Learning (DML) Fairness, which analogises counterfactual fairness in regression problems to the estimation of counterfactual outcomes in causal inference under the Potential Outcomes framework. It uses arbitrary machine learning methods to partial out the effect of sensitive variables on nonsensitive variables and outcomes. Assuming that the effects of the two sets of variables are additively separable, outcomes will be approximately equalised and individual-level outcomes will be counterfactually fair. The paper demonstrates the approach in a simulation study of discrimination in workplace hiring and in an application to real data estimating the GPAs of law school students. It then discusses when it is appropriate to apply such a method to problems of real-world discrimination, where constructs are conceptually complex, and finally whether DML Fairness can achieve justice in these settings.
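The partialling-out step can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single binary sensitive attribute `a`, one nonsensitive feature `x`, a group-mean estimator standing in for an arbitrary machine learning method, two-fold cross-fitting, and simulated hiring-style data whose coefficients are invented purely for illustration. All function and variable names are hypothetical.

```python
import random
from statistics import fmean


def fit_group_means(a_train, v_train):
    """Stand-in 'learner' (assumption): estimate E[V | A] for binary A by group means."""
    means = {}
    for g in (0, 1):
        means[g] = fmean([v for a, v in zip(a_train, v_train) if a == g])
    return lambda a: means[a]


def cross_fit_residuals(a, v, n_folds=2):
    """Residualise v on a; each fold is predicted by a model fit on the other folds."""
    n = len(v)
    resid = [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        predict = fit_group_means([a[i] for i in train], [v[i] for i in train])
        for i in test:
            resid[i] = v[i] - predict(a[i])
    return resid


def group_gap(a, v):
    """Difference in the mean of v between the two sensitive groups."""
    g1 = fmean([vi for ai, vi in zip(a, v) if ai == 1])
    g0 = fmean([vi for ai, vi in zip(a, v) if ai == 0])
    return g1 - g0


# Simulated data (illustrative coefficients, additively separable effects):
# the sensitive attribute shifts both the feature and the outcome.
rng = random.Random(0)
n = 4000
a = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
x = [1.0 * ai + rng.gauss(0, 1) for ai in a]
y = [2.0 * xi - 1.5 * ai + rng.gauss(0, 1) for xi, ai in zip(x, a)]

# Step 1: partial the sensitive attribute out of the feature and the outcome.
x_res = cross_fit_residuals(a, x)
y_res = cross_fit_residuals(a, y)

# Step 2: regress residualised outcome on residualised feature (no intercept).
beta = sum(xr * yr for xr, yr in zip(x_res, y_res)) / sum(xr * xr for xr in x_res)

# Step 3: predict from the residualised feature only, centred on the overall mean.
y_bar = fmean(y)
fair_pred = [y_bar + beta * xr for xr in x_res]

raw_gap = group_gap(a, y)
fair_gap = group_gap(a, fair_pred)
print(f"beta={beta:.3f}  raw group gap={raw_gap:.3f}  fair group gap={fair_gap:.3f}")
```

In this sketch the group-mean learner could be replaced by any regressor fitted on the training folds. The approximate equalisation of group means relies on the additive separability assumption stated above; with interactions between sensitive and nonsensitive variables the residualised predictions would not in general be counterfactually fair.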