Models trained with empirical risk minimization (ERM) are prone to exploiting spurious correlations between target labels and bias attributes, which leads to poor performance on data groups in which those correlations do not hold. The problem is particularly challenging when bias labels are unavailable. To mitigate the effect of spurious correlations without bias labels, we first introduce a novel training objective designed to robustly enhance model performance across all data samples, irrespective of the presence of spurious correlations. From this objective, we then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which requires no bias labels. DPR leverages the disagreement between the target label and the prediction of a biased model to identify bias-conflicting samples (those without spurious correlations) and upsamples them according to the disagreement probability. Empirical evaluations on multiple benchmarks demonstrate that DPR achieves state-of-the-art performance among existing baselines that do not use bias labels. Furthermore, we provide a theoretical analysis detailing how DPR reduces dependency on spurious correlations.
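The resampling idea behind DPR can be sketched concretely: given a biased model's predicted class probabilities, each sample's weight is the probability that the biased model disagrees with the target label, so bias-conflicting samples are drawn more often. This is a minimal illustrative sketch, not the paper's exact estimator; the function names and the use of softmax outputs as the disagreement estimate are assumptions.

```python
import numpy as np

def dpr_sampling_weights(bias_probs, labels):
    """Sampling weights from a biased model's disagreement probability.

    bias_probs: (N, C) array of the biased model's predicted class
    probabilities; labels: (N,) integer target labels. A sample's
    weight is proportional to the probability that the biased model
    disagrees with the target label, upweighting bias-conflicting
    samples. (Illustrative sketch; the paper's estimator may differ.)
    """
    # Probability the biased model assigns to the true label.
    agree = bias_probs[np.arange(len(labels)), labels]
    disagree = 1.0 - agree               # disagreement probability per sample
    return disagree / disagree.sum()     # normalize to a sampling distribution

def resample(weights, n, seed=None):
    """Draw n sample indices with replacement according to the weights."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(weights), size=n, p=weights)
```

In this sketch, a bias-aligned sample that the biased model classifies confidently and correctly receives a near-zero weight, while a bias-conflicting sample it misclassifies dominates the resampled training set.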