Providing formal guarantees of algorithmic fairness is of paramount importance to the socially responsible deployment of machine learning algorithms. In this work, we study formal guarantees, i.e., certificates, for individual fairness (IF) of neural networks. We start by introducing a novel convex approximation of IF constraints that exponentially decreases the computational cost of providing formal guarantees of local individual fairness. We highlight that prior methods are constrained by their focus on global IF certification and can therefore only scale to models with a few dozen hidden neurons, which limits their practical impact. We propose to certify distributional individual fairness, which ensures that, for a given empirical distribution and all distributions within a $\gamma$-Wasserstein ball around it, the neural network has guaranteed individually fair predictions. Leveraging developments in quasi-convex optimization, we provide novel and efficient certified bounds on distributional individual fairness and show that our method allows us to certify and regularize neural networks that are several orders of magnitude larger than those considered by prior works. Moreover, we study real-world distribution shifts and find our bounds to be a scalable, practical, and sound source of IF guarantees.
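
To make the two notions above concrete, one possible formalization is sketched here. The notation is an assumption introduced for illustration and is not taken from the text: $f^{\theta}$ denotes the network, $d_{\mathrm{fair}}$ a similarity metric on inputs, $\epsilon$ a similarity radius, $\delta$ an output tolerance, $P_n$ the empirical distribution, and $W$ a Wasserstein distance. Under these assumptions, local IF at a point $x$ might be written as

$$
\mathcal{I}(f^{\theta}, x) \;=\; \max_{x' \,:\, d_{\mathrm{fair}}(x, x') \le \epsilon} \bigl| f^{\theta}(x) - f^{\theta}(x') \bigr| \;\le\; \delta ,
$$

and distributional IF as the requirement that the expected local violation stays bounded for every distribution in the $\gamma$-Wasserstein ball:

$$
\sup_{Q \,:\, W(P_n, Q) \le \gamma} \; \mathbb{E}_{x \sim Q} \bigl[ \mathcal{I}(f^{\theta}, x) \bigr] \;\le\; \delta .
$$

Read this way, a certificate is a sound upper bound on the left-hand side of either inequality; the exact definitions and norms used in the full work may differ from this sketch.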