This paper investigates neuron dropout as a post-processing bias mitigation technique for deep neural networks (DNNs). Neural-driven software solutions are increasingly applied in socially critical domains with significant fairness implications. While neural networks are exceptionally good at finding statistical patterns in data, they may encode and amplify biases present in the historical data. Existing bias mitigation algorithms often require modifying the input dataset or the learning algorithm. We posit that the prevalent dropout methods, which prevent over-fitting during training by randomly dropping neurons, may also be an effective and less intrusive approach to improving the fairness of pre-trained DNNs. However, finding the ideal set of neurons to drop is a combinatorial problem. We propose NeuFair, a family of post-processing randomized algorithms that mitigate unfairness in pre-trained DNNs by dropping neurons during inference, after training. Our randomized search is guided by an objective that minimizes discrimination while maintaining the model's utility. We show that our design of randomized algorithms is effective and efficient in improving fairness (by up to 69%) with minimal or no degradation in model performance. We provide intuitive explanations of these phenomena and carefully examine the influence of various hyperparameters of the search algorithms on the results. Finally, we compare NeuFair empirically and conceptually to different state-of-the-art bias mitigators.
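To make the core idea concrete, the sketch below illustrates one way a randomized post-processing search over dropout masks could look. This is a minimal toy illustration under our own assumptions, not the authors' NeuFair implementation: it uses a one-hidden-layer network, demographic-parity gap as the discrimination measure, and a hypothetical 2% accuracy-loss budget as the utility constraint.

```python
import random
import numpy as np

def fairness_gap(preds, groups):
    # Demographic-parity gap: absolute difference in positive-prediction
    # rates between the two protected groups (labeled 0 and 1).
    r0 = preds[groups == 0].mean()
    r1 = preds[groups == 1].mean()
    return abs(r0 - r1)

def predict(x, w1, w2, mask):
    # One-hidden-layer ReLU network; `mask` zeroes out the dropped
    # hidden neurons at inference time (post-training dropout).
    h = np.maximum(x @ w1, 0.0) * mask
    return (h @ w2 > 0).astype(int)

def random_search(x, y, groups, w1, w2, iters=200, k=2, seed=0):
    # Randomly drop k hidden neurons per trial; keep the mask that most
    # reduces the fairness gap while losing at most 2% accuracy
    # relative to the unmasked model (the utility constraint).
    rng = random.Random(seed)
    n_hidden = w1.shape[1]
    base_mask = np.ones(n_hidden)
    base_preds = predict(x, w1, w2, base_mask)
    base_acc = (base_preds == y).mean()
    best_mask, best_gap = base_mask, fairness_gap(base_preds, groups)
    for _ in range(iters):
        mask = np.ones(n_hidden)
        for i in rng.sample(range(n_hidden), k):
            mask[i] = 0.0
        preds = predict(x, w1, w2, mask)
        if (preds == y).mean() >= base_acc - 0.02:
            gap = fairness_gap(preds, groups)
            if gap < best_gap:
                best_mask, best_gap = mask, gap
    return best_mask, best_gap
```

The search never returns a mask with a worse fairness gap than the unmasked baseline, since the baseline mask is the starting incumbent; more sophisticated variants (e.g. simulated annealing over neuron subsets) would follow the same evaluate-and-accept pattern.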