Bias Mimicking: A Simple Sampling Approach for Bias Mitigation

Prior work has shown that visual recognition datasets frequently underrepresent bias groups $B$ (\eg Female) within class labels $Y$ (\eg Programmers). This dataset bias can lead to models that learn spurious correlations between class labels and bias groups such as age, gender, or race. Most recent methods that address this problem require significant architectural changes or additional loss functions, both of which demand more hyperparameter tuning. Alternatively, data sampling baselines from the class imbalance literature (\eg Undersampling, Upweighting), which can often be implemented in a single line of code and often have no hyperparameters, offer a cheaper and more efficient solution. However, these methods suffer from significant shortcomings. For example, Undersampling drops a significant part of the input distribution per epoch, while Oversampling repeats samples, causing overfitting. To address these shortcomings, we introduce a new class-conditioned sampling method: Bias Mimicking (BM). The method is based on the observation that if the bias distribution of a class $c$, \ie $P_D(B|Y=c)$, is mimicked by every other class $c^{\prime}\neq c$, then $Y$ and $B$ are statistically independent. Using this notion, BM, through a novel training procedure, ensures that the model is exposed to the entire distribution per epoch without repeating samples. Consequently, Bias Mimicking improves the accuracy of sampling methods on underrepresented groups by 3\% over four benchmarks while maintaining, and sometimes improving, performance relative to non-sampling methods. Code: \url{https://github.com/mqraitem/Bias-Mimicking}
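The independence observation can be checked on a toy example. If $P_D(B|Y=c^{\prime}) = P_D(B|Y=c)$ for every class $c^{\prime}$, then marginalizing over $Y$ gives $P_D(B) = P_D(B|Y=c)$, so $P_D(B|Y) = P_D(B)$. The Python sketch below illustrates only the subsampling step for a single target class: it keeps all samples of the target class and subsamples every other class so that its bias proportions match. It is not the authors' implementation, and it does not cover the training procedure, which the abstract does not describe. The function name `mimic_indices`, its arguments, and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def mimic_indices(labels, biases, target_class, seed=0):
    """Return indices of a subset in which every class c' != target_class
    has (up to rounding) the same bias distribution as target_class.

    Assumes target_class occurs at least once in labels.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)  # (class, bias) -> sample indices
    for i, (y, b) in enumerate(zip(labels, biases)):
        groups[(y, b)].append(i)

    bias_values = sorted(set(biases))
    n_target = sum(len(groups[(target_class, b)]) for b in bias_values)
    # Empirical P_D(B = b | Y = target_class).
    p_target = {b: len(groups[(target_class, b)]) / n_target
                for b in bias_values}

    keep = []
    for y in sorted(set(labels)):
        if y == target_class:
            # The target class keeps all of its samples.
            for b in bias_values:
                keep.extend(groups[(y, b)])
            continue
        # Largest subset size of class y whose bias proportions match p_target.
        size = min(len(groups[(y, b)]) / p_target[b]
                   for b in bias_values if p_target[b] > 0)
        for b in bias_values:
            k = int(size * p_target[b] + 1e-9)  # round down, guard float error
            keep.extend(rng.sample(groups[(y, b)], k))
    return sorted(keep)


# Toy data: class 0 is 80% bias group 0; class 1 is only 40% bias group 0.
labels = [0] * 100 + [1] * 100
biases = [0] * 80 + [1] * 20 + [0] * 40 + [1] * 60

keep = mimic_indices(labels, biases, target_class=0)
counts = Counter((labels[i], biases[i]) for i in keep)
print(sorted(counts.items()))
# [((0, 0), 80), ((0, 1), 20), ((1, 0), 40), ((1, 1), 10)]
```

In the subsampled set both classes have an 80/20 split over the bias groups, so the class label carries no information about the bias group. Note that class 1 loses half of its samples here. This is the cost of a single mimicked distribution, and it is the kind of data loss the abstract says BM's training procedure is designed to avoid.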
