The responsible use of machine learning tools in real-world, high-stakes decision making demands that we audit and control for potential biases against underrepresented groups. This process naturally requires access to the sensitive attribute one wishes to control for, such as demographics, gender, or other potentially sensitive features. Unfortunately, this information is often unavailable. In this work, we demonstrate that one can still reliably estimate, and ultimately control for, fairness by using proxy sensitive attributes derived from a sensitive attribute predictor. Specifically, we first show that with only limited knowledge of the complete data distribution, one may use a sensitive attribute predictor to obtain bounds on the classifier's true fairness metric. Second, we demonstrate how one can provably control a classifier's worst-case fairness violation with respect to the true sensitive attribute by controlling for fairness with respect to the proxy sensitive attribute. Our results hold under assumptions that are significantly milder than those of previous work, and we illustrate them with experiments on synthetic and real datasets.
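To make the setting concrete, the following is a minimal toy simulation, not the method or the bounds developed in this work. It assumes a specific fairness metric (the gap in true positive rates between two groups), a hypothetical biased classifier, and a hypothetical attribute predictor whose errors are symmetric and independent of everything else; the names `tpr_gap` and `simulate` and all numeric rates are illustrative choices. It compares the gap measured with the true sensitive attribute against the gap measured with the proxy.

```python
import random


def tpr_gap(y_true, y_pred, group):
    """Absolute difference in true positive rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        # Predictions on the positive-label examples that belong to group g.
        preds = [p for t, p, a in zip(y_true, y_pred, group) if a == g and t == 1]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def simulate(n=50000, proxy_error=0.1, seed=0):
    """Return (gap w.r.t. true attribute, gap w.r.t. proxy attribute).

    All rates below are assumptions made for illustration only.
    """
    rng = random.Random(seed)
    a_true, a_proxy, y_true, y_pred = [], [], [], []
    for _ in range(n):
        a = rng.randint(0, 1)  # true sensitive attribute, balanced groups
        y = rng.randint(0, 1)  # true label, balanced classes
        # Hypothetical biased classifier: TPR 0.9 for group 0, 0.7 for group 1,
        # and a false positive rate of 0.1 for both groups.
        if y == 1:
            p_pos = 0.9 if a == 0 else 0.7
        else:
            p_pos = 0.1
        y_hat = 1 if rng.random() < p_pos else 0
        # Hypothetical attribute predictor: flips the true attribute with
        # probability proxy_error, independently of the label and prediction.
        a_hat = 1 - a if rng.random() < proxy_error else a
        a_true.append(a)
        a_proxy.append(a_hat)
        y_true.append(y)
        y_pred.append(y_hat)
    return tpr_gap(y_true, y_pred, a_true), tpr_gap(y_true, y_pred, a_proxy)


if __name__ == "__main__":
    true_gap, proxy_gap = simulate()
    print(f"TPR gap w.r.t. true attribute:  {true_gap:.3f}")
    print(f"TPR gap w.r.t. proxy attribute: {proxy_gap:.3f}")
```

In this particular toy setup, with balanced groups and a symmetric error rate e, the population gap measured with the proxy equals (1 - 2e) times the true gap, so the naive proxy estimate understates the violation. That attenuation is a property of these simplifying assumptions only; it illustrates why bounds on the true metric, rather than the raw proxy estimate, are the quantity of interest.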