Fairness of machine learning predictions is widely required in practice for legal, ethical, and societal reasons. Existing work typically focuses on settings without unobserved confounding, even though unobserved confounding can lead to severe violations of causal fairness and, thus, unfair predictions. In this work, we analyze the sensitivity of causal fairness to unobserved confounding. Our contributions are threefold. First, we derive bounds for causal fairness metrics under different sources of unobserved confounding. This enables practitioners to examine the sensitivity of their machine learning models to unobserved confounding in fairness-critical applications. Second, we propose a novel neural framework for learning fair predictions, which allows us to offer worst-case guarantees on the extent to which causal fairness can be violated due to unobserved confounding. Third, we demonstrate the effectiveness of our framework in a series of experiments, including a real-world case study about predicting prison sentences. To the best of our knowledge, ours is the first work to study causal fairness under unobserved confounding. As such, our work is of direct practical value as a refutation strategy for ensuring the fairness of predictions in high-stakes applications.