Learning a fair predictive model is crucial to mitigate biased decisions against minority groups in high-stakes applications. A common approach to learning such a model is to solve an optimization problem that maximizes the model's predictive power under an appropriate group fairness constraint. In practice, however, sensitive attributes are often missing or noisy, which makes them uncertain. We demonstrate that enforcing fairness constraints only on the uncertain sensitive attributes can fall significantly short of the level of fairness achieved by models trained without uncertainty. To overcome this limitation, we propose a bootstrap-based algorithm that achieves the target level of fairness despite the uncertainty in sensitive attributes. The algorithm is guided by a Gaussian analysis for the independence notion of fairness, in which we propose a robust quadratically constrained quadratic problem to ensure a strict fairness guarantee with uncertain sensitive attributes. Our algorithm is applicable to both discrete and continuous sensitive attributes and is effective in real-world classification and regression tasks for various group fairness notions, e.g., independence and separation.
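To make the shortfall described above concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes a synthetic Gaussian setting, additive noise on a continuous sensitive attribute, and absolute Pearson correlation as a stand-in measure of independence violation; the function names (`pearson`, `bootstrap_upper`) and all constants are hypothetical. It shows that the violation measured against a noisy attribute understates the violation against the true attribute, and how resampling can quantify the sampling variability of the measured value.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_upper(pred, attr, n_boot=200, q=0.95, seed=0):
    """Upper bootstrap quantile of |corr(pred, attr)| (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(pred)
    gaps = []
    for _ in range(n_boot):
        # Resample (prediction, attribute) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        gaps.append(abs(pearson([pred[i] for i in idx],
                                [attr[i] for i in idx])))
    gaps.sort()
    return gaps[min(n_boot - 1, int(q * n_boot))]


# Synthetic data (assumption): predictions depend on the true attribute,
# but only a noisy copy of that attribute is observed.
rng = random.Random(42)
n = 2000
a_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
a_noisy = [a + rng.gauss(0.0, 1.0) for a in a_true]
y_pred = [0.5 * a + rng.gauss(0.0, 1.0) for a in a_true]

gap_true = abs(pearson(y_pred, a_true))    # violation w.r.t. true attribute
gap_noisy = abs(pearson(y_pred, a_noisy))  # violation as seen through noise
upper_noisy = bootstrap_upper(y_pred, a_noisy)

print(f"gap (true attribute):  {gap_true:.3f}")
print(f"gap (noisy attribute): {gap_noisy:.3f}")
print(f"bootstrap upper bound on noisy gap: {upper_noisy:.3f}")
```

In this toy setting, a constraint that looks satisfied on the noisy attribute can still be violated on the true one, which is the motivation for tightening the constraint under uncertainty. The bootstrap quantile here only captures sampling variability of the noisy estimate; it does not by itself correct for attribute noise.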