Models trained with empirical risk minimization (ERM) are prone to relying on spurious correlations between target labels and bias attributes, which leads to poor performance on data groups that lack these correlations. This problem is particularly challenging to address when bias labels are unavailable. To mitigate the effect of spurious correlations without bias labels, we first introduce a novel training objective designed to robustly enhance model performance across all data samples, irrespective of the presence of spurious correlations. From this objective, we then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which does not require bias labels. DPR leverages the disagreement between the target label and the prediction of a biased model to identify bias-conflicting samples (those without spurious correlations) and upsamples them according to the disagreement probability. Empirical evaluations on multiple benchmarks demonstrate that DPR outperforms existing baselines that do not use bias labels, achieving state-of-the-art performance. Furthermore, we provide a theoretical analysis detailing how DPR reduces dependency on spurious correlations.
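To make the resampling idea concrete, the following is a minimal sketch (not the authors' implementation) of disagreement-probability-based resampling. It assumes the disagreement probability is taken as one minus the biased model's predicted probability on the target class, and that the resulting weights define a sampling distribution over the training set; `disagreement_resampling_weights`, the placeholder arrays, and these modeling choices are illustrative assumptions.

```python
import numpy as np

def disagreement_resampling_weights(biased_probs, labels):
    """Hypothetical sketch: turn a biased model's disagreement with the
    target labels into per-sample resampling weights.

    biased_probs: (N, C) array of class probabilities from a biased model.
    labels: (N,) array of integer target labels.
    Returns weights proportional to the probability that the biased model
    disagrees with the target label, so bias-conflicting samples are upsampled.
    """
    # Probability the biased model assigns to the correct (target) class.
    p_correct = biased_probs[np.arange(len(labels)), labels]
    # Assumed disagreement probability: how likely the biased prediction
    # differs from the target label; large for bias-conflicting samples.
    disagreement = 1.0 - p_correct
    # Normalize into a sampling distribution over the training set.
    return disagreement / disagreement.sum()

# Usage sketch with placeholder data: draw an upsampled training batch.
rng = np.random.default_rng(0)
N, C = 1000, 10
biased_probs = rng.dirichlet(np.ones(C), size=N)  # stand-in for biased-model outputs
labels = rng.integers(0, C, size=N)               # stand-in for target labels
w = disagreement_resampling_weights(biased_probs, labels)
batch_idx = rng.choice(N, size=128, replace=True, p=w)
```

In this sketch, samples the biased model classifies confidently and correctly (bias-aligned samples) receive low weight, while samples it misclassifies (bias-conflicting samples) are drawn more often, which is the resampling behavior the abstract describes.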