Dataset biases often let a model achieve high performance on in-distribution data while performing poorly on out-of-distribution data. To mitigate the detrimental effect of bias on the network, previous works have proposed debiasing methods that down-weight the biased examples identified by an auxiliary model trained with explicit bias labels. However, identifying the types of bias present in a dataset is costly. Recent studies have therefore attempted to make the auxiliary model biased without the guidance (or annotation) of bias labels, by constraining either the model's training environment or the capacity of the model itself. Despite the promising debiasing results of these works, the multi-class learning objective that has been naively used to train the auxiliary model may weaken bias mitigation because of its regularization effect and the competition it induces across classes. As an alternative, we propose a new debiasing framework that introduces binary classifiers, which we call bias experts, between the auxiliary model and the main model. Specifically, each bias expert is trained on a binary classification task derived from the multi-class classification task via the One-vs-Rest approach. Experimental results demonstrate that the proposed strategy improves the bias identification ability of the auxiliary model. Consequently, our debiased model consistently outperforms the state of the art on various challenge datasets.
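The One-vs-Rest derivation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`one_vs_rest_targets`, `expert_confidence`, `downweight`) are hypothetical, and the `1 - confidence` weighting rule is an assumption chosen only to show how a confident bias expert could translate into a smaller sample weight. The sketch also contrasts independent sigmoid outputs with a softmax, where classes compete for probability mass.

```python
import numpy as np

def one_vs_rest_targets(labels, num_classes):
    """Turn integer class labels into one binary target column per class."""
    labels = np.asarray(labels)
    return (labels[:, None] == np.arange(num_classes)[None, :]).astype(np.float64)

def sigmoid(z):
    """Independent per-class probability, as used by each binary expert."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Multi-class probabilities; each row sums to 1, so classes compete."""
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expert_confidence(logits, labels):
    """Probability that the expert for each sample's true class outputs 'positive'."""
    labels = np.asarray(labels)
    return sigmoid(logits)[np.arange(len(labels)), labels]

def downweight(confidence):
    """Assumed rule (not from the source): confident samples get low weight."""
    return 1.0 - confidence

# Three samples, three classes; column k holds the logit of expert k.
labels = np.array([0, 2, 1])
logits = np.array([[4.0, 3.5, -2.0],
                   [0.0, -1.0, 0.5],
                   [-3.0, 2.0, -3.0]])

targets = one_vs_rest_targets(labels, num_classes=3)  # one binary task per class
weights = downweight(expert_confidence(logits, labels))
```

In the first row, the experts for classes 0 and 1 can both be confident because their sigmoid outputs are independent, whereas the softmax over the same logits has to split the probability mass between the two classes.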