Fairness in predictions is of direct importance in practice for legal, ethical, and societal reasons. A common way to formalize it is counterfactual fairness, which requires that the prediction for an individual be the same as it would be in a counterfactual world where that individual had a different sensitive attribute. However, achieving counterfactual fairness is challenging because counterfactuals are unobservable. In this paper, we develop a novel deep neural network called Generative Counterfactual Fairness Network (GCFN) for making predictions under counterfactual fairness. Specifically, we leverage a tailored generative adversarial network to directly learn the counterfactual distribution of the descendants of the sensitive attribute, which we then use to enforce fair predictions through a novel counterfactual mediator regularization. If the counterfactual distribution is learned sufficiently well, our method is mathematically guaranteed to ensure the notion of counterfactual fairness. Thereby, our GCFN addresses two key shortcomings of existing baselines that rely on inferring latent variables: (a) the inferred latent variables are potentially correlated with the sensitive attribute and thus lead to bias, and (b) the baselines have weak capability in constructing latent representations and thus achieve low prediction performance. Across various experiments, our method achieves state-of-the-art performance. Using a real-world case study from recidivism prediction, we further demonstrate that our method makes meaningful predictions in practice.
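To make the idea of a counterfactual mediator regularization more concrete, the following is a minimal sketch and not the GCFN implementation. It assumes that counterfactual mediators have already been produced by some generator (in the paper, a tailored generative adversarial network) and penalizes the gap between the prediction on the factual mediators and the prediction on the generated counterfactual mediators. The squared-difference penalty, the squared-error prediction loss, the weight `lam`, and all function names are illustrative assumptions; the abstract does not specify these details.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
# Hypothetical predictor signature: (covariates, mediators) -> scalar prediction.
Predictor = Callable[[Vector, Vector], float]


def counterfactual_mediator_penalty(
    predict: Predictor,
    covariates: Sequence[Vector],
    mediators: Sequence[Vector],
    cf_mediators: Sequence[Vector],
) -> float:
    """Mean squared gap between predictions on factual and counterfactual mediators.

    Assumption: cf_mediators[i] is a generated counterfactual version of
    mediators[i] under a different sensitive attribute.
    """
    n = len(covariates)
    if n == 0 or not (n == len(mediators) == len(cf_mediators)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for x, m, m_cf in zip(covariates, mediators, cf_mediators):
        gap = predict(x, m) - predict(x, m_cf)
        total += gap * gap
    return total / n


def regularized_loss(
    predict: Predictor,
    covariates: Sequence[Vector],
    mediators: Sequence[Vector],
    cf_mediators: Sequence[Vector],
    targets: Sequence[float],
    lam: float,
) -> float:
    """Squared-error prediction loss plus lam times the fairness penalty (assumed form)."""
    n = len(targets)
    if n == 0 or n != len(covariates):
        raise ValueError("targets must be non-empty and match the inputs")
    mse = sum(
        (predict(x, m) - y) ** 2
        for x, m, y in zip(covariates, mediators, targets)
    ) / n
    penalty = counterfactual_mediator_penalty(
        predict, covariates, mediators, cf_mediators
    )
    return mse + lam * penalty


# Toy predictors: one ignores the mediator, one depends on it.
def ignores_mediator(x: Vector, m: Vector) -> float:
    return 2.0 * x[0]


def uses_mediator(x: Vector, m: Vector) -> float:
    return 2.0 * x[0] + m[0]
```

In this toy setting, a predictor that ignores the mediator incurs zero penalty, while one that depends on it is penalized in proportion to how much its output changes between the factual and the counterfactual mediators. Increasing `lam` trades prediction accuracy for smaller counterfactual gaps.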