Decision-making processes have increasingly come to rely on sophisticated machine learning tools, raising concerns about the fairness of their predictions with respect to sensitive groups. The widespread use of commercial black-box machine learning models necessitates careful consideration of their legal and ethical implications for consumers. In situations where users have access only to the predictions of these "black-box" models, a key question emerges: how can we mitigate or eliminate the influence of sensitive attributes, such as race or gender? We propose towerDebias (tDB), a novel approach designed to reduce the influence of sensitive variables in predictions made by black-box models. Using the Tower Property from probability theory, tDB improves prediction fairness at the post-processing stage in a manner amenable to the fairness-utility tradeoff. The method is highly flexible, requires no knowledge of the original model's internal structure, and extends to a range of applications. We provide a formal improvement theorem for tDB and demonstrate its effectiveness in both regression and classification tasks, underscoring its impact on the fairness-utility tradeoff.
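The idea can be illustrated with a minimal sketch. The Tower Property implies that replacing a black-box prediction f(X, S) with an estimate of E[f(X, S) | X] marginalizes out the sensitive attribute S while retaining the information carried by X. The code below is a hypothetical toy illustration of this post-processing step, not the authors' implementation: the black-box model, the k-nearest-neighbor averaging over x, and all variable names are assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a binary sensitive attribute s and one non-sensitive feature x.
n = 2000
s = rng.integers(0, 2, size=n)   # sensitive attribute (e.g., a group label)
x = rng.normal(size=n)           # legitimate predictive feature

# A hypothetical "black-box" model whose output leaks s directly.
def black_box(x, s):
    return 2.0 * x + 1.5 * s

yhat = black_box(x, s)

# tDB-style post-processing sketch: replace each prediction with an
# estimate of E[f(X, S) | X], here via a k-nearest-neighbor average
# computed on x alone, so the averaging marginalizes over S.
def tower_debias(x, yhat, k=50):
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], yhat[order]
    out = np.empty_like(yhat)
    for rank, idx in enumerate(order):
        lo = max(0, rank - k // 2)
        hi = min(len(x), lo + k)
        lo = max(0, hi - k)
        out[idx] = y_sorted[lo:hi].mean()  # average over neighbors in x
    return out

yhat_fair = tower_debias(x, yhat)

# The association between predictions and s should shrink after debiasing,
# at some cost in fidelity to the original model (the fairness-utility tradeoff).
corr_before = abs(np.corrcoef(yhat, s)[0, 1])
corr_after = abs(np.corrcoef(yhat_fair, s)[0, 1])
print(corr_before, corr_after)
```

Because s is independent of x in this toy setup, averaging predictions over neighbors in x mixes both values of s, so the debiased predictions are nearly uncorrelated with the sensitive attribute while preserving the signal from x.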