This project explores adversarial training techniques for developing fairer Deep Neural Networks (DNNs), with the goal of mitigating the bias they are known to exhibit. DNNs are susceptible to inheriting bias with respect to sensitive attributes such as race and gender, which can lead to life-altering outcomes (e.g., demographic bias in facial recognition software used to arrest a suspect). We propose a robust optimization problem and demonstrate, using an affine linear model, that it can improve fairness on several synthetic and real-world datasets. By leveraging second-order information, we are able to solve this optimization problem more efficiently than with a purely first-order method.
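The abstract does not state the optimization problem itself. As a point of reference only, adversarial training is commonly written as a min-max (robust optimization) problem. The sketch below assumes that generic form; the affine linear model $f_\theta(x) = Wx + b$ with $\theta = (W, b)$, the loss $\ell$, the perturbation budget $\epsilon$, and the step size $\alpha_k$ are illustrative symbols, not the authors' notation. The last two lines contrast a purely first-order update with a Newton-type update, assuming the objective $J$ is twice differentiable at $\theta_k$ with an invertible Hessian.

```latex
% Generic robust (adversarial) training objective; assumed form, not taken from the source.
\min_{\theta} \; J(\theta)
  := \frac{1}{n} \sum_{i=1}^{n} \max_{\|\delta_i\| \le \epsilon}
     \ell\bigl(f_\theta(x_i + \delta_i),\, y_i\bigr),
\qquad f_\theta(x) = W x + b .

% First-order step: uses only the gradient of J.
\theta_{k+1} = \theta_k - \alpha_k \nabla J(\theta_k)

% Second-order (Newton-type) step: rescales the gradient by the inverse Hessian.
\theta_{k+1} = \theta_k - \alpha_k \bigl[\nabla^2 J(\theta_k)\bigr]^{-1} \nabla J(\theta_k)
```

In this generic picture, the second-order step uses curvature information to choose a better-scaled search direction, which is one way a method can need fewer iterations than gradient descent alone.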