As the frontier of machine learning applications moves further into human interaction, multiple concerns arise regarding automated decision-making. Two of the most critical are fairness and data privacy. On the one hand, one must guarantee that automated decisions are not biased against certain groups, especially unprotected or marginalized ones. On the other hand, one must ensure that the use of personal information fully abides by privacy regulations and that user identities are kept safe. The balance between privacy, fairness, and predictive performance is complex. Yet, despite the potential societal impact of these optimization vectors, the dynamics between them remain poorly understood. In this paper, we study this three-way tension and how optimizing each vector impacts the others, aiming to inform the future development of safe applications. Although it has been claimed that predictive performance and fairness can be jointly optimized, we find that this is only possible at the expense of data privacy. Overall, experimental results show that one of the vectors will be penalized regardless of which of the three we optimize. Nonetheless, we find promising avenues for future work in joint optimization solutions, where smaller trade-offs are observed between the three vectors.