Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations and poor generalization on minority groups. The recent deep feature reweighting (DFR) technique achieves state-of-the-art group robustness via simple last-layer retraining, but it requires held-out group and class annotations to construct a group-balanced reweighting dataset. In this work, we examine this impractical requirement and find that last-layer retraining can be surprisingly effective with no group annotations (other than for model selection) and only a handful of class annotations. We first show that last-layer retraining can greatly improve worst-group accuracy even when the reweighting dataset has only a small proportion of worst-group data. This implies a "free lunch" where holding out a subset of training data to retrain the last layer can substantially outperform ERM on the entire dataset with no additional data or annotations. To further improve group robustness, we introduce a lightweight method called selective last-layer finetuning (SELF), which constructs the reweighting dataset using misclassifications or disagreements. Our empirical and theoretical results present the first evidence that model disagreement upsamples worst-group data, enabling SELF to nearly match DFR on four well-established benchmarks across vision and language tasks with no group annotations and less than 3% of the held-out class annotations. Our code is available at https://github.com/tmlabonte/last-layer-retraining.
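The disagreement-based selection followed by last-layer finetuning can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It makes several assumptions: the function names `select_by_disagreement` and `finetune_last_layer` are hypothetical, disagreement is scored by total variation distance between two models' predicted class probabilities, the second model is a perturbed stand-in for something like an earlier checkpoint, and the features and labels are synthetic.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def select_by_disagreement(probs_a, probs_b, n):
    """Return indices of the n held-out points on which two models disagree most.

    Assumption: disagreement is scored by total variation distance
    between the two predicted class distributions.
    """
    score = 0.5 * np.abs(probs_a - probs_b).sum(axis=1)
    return np.argsort(-score, kind="stable")[:n]

def finetune_last_layer(W, b, feats, labels, lr=0.1, steps=200):
    """Gradient descent on softmax cross-entropy, updating only the linear last layer."""
    W, b = W.copy(), b.copy()
    onehot = np.eye(W.shape[1])[labels]
    for _ in range(steps):
        grad = (softmax(feats @ W + b) - onehot) / len(feats)
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic stand-ins (assumptions): frozen penultimate-layer features of a
# held-out set, class labels, and the last layers of two models.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 8))
labels = rng.integers(0, 2, size=200)
W_final, b_final = rng.normal(size=(8, 2)), np.zeros(2)
W_other = W_final + rng.normal(size=(8, 2))  # e.g. an earlier checkpoint

probs_final = softmax(feats @ W_final + b_final)
probs_other = softmax(feats @ W_other + b_final)

# Only the selected points need class labels; no group labels are used.
idx = select_by_disagreement(probs_final, probs_other, n=20)
W_new, b_new = finetune_last_layer(W_final, b_final, feats[idx], labels[idx])
```

In this sketch the feature extractor is never updated: only `W` and `b` change, and class labels are read only for the 20 selected points, which mirrors the small annotation budget described above.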