As machine learning models are increasingly deployed in real-world, high-stakes decision settings, it is essential that we can audit and control for any fairness violations these models may exhibit towards certain groups. Doing so naturally requires access to sensitive attributes, such as demographics, gender, or other potentially sensitive features that determine group membership. Unfortunately, in many settings this information is unavailable. In this work we study the well-known \emph{equalized odds} (EOD) definition of fairness. In a setting without sensitive attributes, we first provide tight and computable upper bounds for the EOD violation of a predictor. These bounds precisely reflect the worst possible EOD violation. Second, we demonstrate how one can provably control the worst-case EOD with a new post-processing correction method. Our results characterize when directly controlling for EOD with respect to the predicted sensitive attributes is, and when it is not, optimal for controlling worst-case EOD. Our results hold under assumptions that are milder than those of previous works, and we illustrate them with experiments on synthetic and real datasets.
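For concreteness, the quantity being bounded can be illustrated in the easier setting where the sensitive attribute is observed. The sketch below assumes a binary label, a binary prediction, and a binary group attribute, and takes the EOD violation to be the larger of the true-positive-rate gap and the false-positive-rate gap between the two groups. This is one common convention and may differ from the exact definition used in the paper; the function names `conditional_rate` and `eod_violation` are hypothetical and chosen only for illustration.

```python
def conditional_rate(y_true, y_pred, group, y, a):
    """Empirical P(prediction = 1 | label = y, group = a)."""
    preds = [p for t, p, g in zip(y_true, y_pred, group) if t == y and g == a]
    if not preds:
        raise ValueError(f"no samples with label={y} and group={a}")
    return sum(preds) / len(preds)


def eod_violation(y_true, y_pred, group):
    """Largest gap between the two groups in FPR (y = 0) or TPR (y = 1).

    Assumes labels, predictions, and group attributes are all in {0, 1}.
    This is an illustrative convention, not necessarily the paper's definition.
    """
    gaps = []
    for y in (0, 1):
        rate_group_0 = conditional_rate(y_true, y_pred, group, y, 0)
        rate_group_1 = conditional_rate(y_true, y_pred, group, y, 1)
        gaps.append(abs(rate_group_0 - rate_group_1))
    return max(gaps)


# Toy data: four positives and four negatives, split evenly across two groups.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
group = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

violation = eod_violation(y_true, y_pred, group)
print(violation)
```

When the group attribute is not observed, this quantity cannot be computed directly, which is the situation the bounds in this work address.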