Deep neural networks (DNNs) are prone to learning spurious correlations between target classes and bias attributes, such as gender and race, that are inherent in a major portion of the training data (bias-aligned samples). As a result, they behave unfairly and provoke controversy in a modern pluralistic and egalitarian society. In this paper, we propose a novel marginal debiased network (MDN) to learn debiased representations. More specifically, we design a marginal softmax loss (MSL) that brings the idea of a margin penalty to the fairness problem: it assigns a larger margin to bias-conflicting samples (data without spurious correlations) than to bias-aligned ones, so as to de-emphasize the spurious correlations and improve generalization on unbiased test criteria. To determine the margins, our MDN is optimized through a meta-learning framework. We propose a meta equalized loss (MEL) to perceive model fairness, and adaptively update the margin parameters by meta-optimization, which requires that the model trained under the optimal margins minimize the MEL computed on an unbiased meta-validation set. Extensive experiments on the BiasedMNIST, Corrupted CIFAR-10, CelebA and UTK-Face datasets demonstrate that our MDN achieves strong performance on under-represented samples and obtains better debiased results than previous approaches.
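To make the margin idea concrete, the following is a minimal sketch of a softmax cross-entropy loss with a group-dependent margin. It is not the paper's MSL: the abstract does not give the exact formula, so the additive form (subtracting the margin from the target logit), the function name `margin_softmax_loss`, and the fixed values in `MARGINS` are all assumptions made for illustration. In MDN the margins would be learned by meta-optimization rather than set by hand.

```python
import math


def margin_softmax_loss(logits, target, margin):
    """Softmax cross-entropy with an additive margin on the target logit.

    Assumed illustrative form, not the paper's exact MSL.
    """
    adjusted = list(logits)
    adjusted[target] -= margin  # the true class must win by at least `margin`
    peak = max(adjusted)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    return log_sum - adjusted[target]


# Hypothetical fixed margins: larger for bias-conflicting samples.
MARGINS = {"aligned": 0.1, "conflicting": 0.5}

logits = [2.0, 0.5, -1.0]  # made-up scores for a 3-class problem
loss_aligned = margin_softmax_loss(logits, 0, MARGINS["aligned"])
loss_conflicting = margin_softmax_loss(logits, 0, MARGINS["conflicting"])
```

For identical logits, the larger margin yields a larger loss, so under this assumed form bias-conflicting samples contribute more to the gradient than bias-aligned ones. With a margin of zero the function reduces to ordinary softmax cross-entropy.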