Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations and poor generalization on minority groups. The recent deep feature reweighting (DFR) technique achieves state-of-the-art group robustness via simple last-layer retraining, but it requires held-out group and class annotations to construct a group-balanced reweighting dataset. In this work, we examine this impractical requirement and find that last-layer retraining can be surprisingly effective with no group annotations (other than for model selection) and only a handful of class annotations. We first show that last-layer retraining can greatly improve worst-group accuracy even when the reweighting dataset has only a small proportion of worst-group data. This implies a "free lunch" where holding out a subset of training data to retrain the last layer can substantially outperform ERM on the entire dataset with no additional data or annotations. To further improve group robustness, we introduce a lightweight method called selective last-layer finetuning (SELF), which constructs the reweighting dataset using misclassifications or disagreements. Our empirical and theoretical results present the first evidence that model disagreement upsamples worst-group data, enabling SELF to nearly match DFR on four well-established benchmarks across vision and language tasks with no group annotations and less than 3% of the held-out class annotations. Our code is available at https://github.com/tmlabonte/last-layer-retraining.
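To make the selection step concrete, the following is a minimal sketch of how a reweighting dataset might be built from misclassifications or disagreements and then used to update only the last layer. It is not the authors' implementation (see the linked repository for that). The function names, the use of hard argmax disagreement, the full-batch gradient descent, and the hyperparameters `lr` and `steps` are all assumptions made for illustration. The abstract also does not say which two models are compared, so `logits_a` and `logits_b` simply stand for the held-out outputs of any two models.

```python
import numpy as np


def misclassified_indices(logits, labels):
    """Indices of held-out points the model gets wrong (uses class labels for every point)."""
    return np.flatnonzero(logits.argmax(axis=1) != labels)


def disagreement_indices(logits_a, logits_b):
    """Indices where two models predict different classes (uses no labels at all)."""
    return np.flatnonzero(logits_a.argmax(axis=1) != logits_b.argmax(axis=1))


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def finetune_last_layer(features, labels, W, b, lr=0.1, steps=200):
    """Full-batch gradient descent on cross-entropy, updating only the linear head (W, b).

    features: (n, d) penultimate-layer embeddings of the selected points (feature extractor frozen)
    labels:   (n,) integer class labels of the selected points
    W, b:     (d, k) weights and (k,) bias of the existing last layer, used as the starting point
    """
    W, b = W.copy(), b.copy()
    onehot = np.eye(W.shape[1])[labels]
    for _ in range(steps):
        # Gradient of mean cross-entropy with respect to the logits
        grad_logits = (softmax(features @ W + b) - onehot) / len(labels)
        W -= lr * features.T @ grad_logits
        b -= lr * grad_logits.sum(axis=0)
    return W, b
```

In this sketch, `disagreement_indices` needs no annotations, so class labels would only have to be collected for the selected points before calling `finetune_last_layer`; `misclassified_indices`, by contrast, requires class labels for the whole held-out set.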