Towards Last-layer Retraining for Group Robustness with Fewer Annotations

Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations and poor generalization on minority groups. The recent deep feature reweighting (DFR) technique achieves state-of-the-art group robustness via simple last-layer retraining, but it requires held-out group and class annotations to construct a group-balanced reweighting dataset. In this work, we examine this impractical requirement and find that last-layer retraining can be surprisingly effective with no group annotations (other than for model selection) and only a handful of class annotations. We first show that last-layer retraining can greatly improve worst-group accuracy even when the reweighting dataset has only a small proportion of worst-group data. This implies a "free lunch" where holding out a subset of training data to retrain the last layer can substantially outperform ERM on the entire dataset with no additional data or annotations. To further improve group robustness, we introduce a lightweight method called selective last-layer finetuning (SELF), which constructs the reweighting dataset using misclassifications or disagreements. Our empirical and theoretical results present the first evidence that model disagreement upsamples worst-group data, enabling SELF to nearly match DFR on four well-established benchmarks across vision and language tasks with no group annotations and less than 3% of the held-out class annotations. Our code is available at https://github.com/tmlabonte/last-layer-retraining.
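As a rough illustration of how a reweighting dataset might be built from misclassifications or disagreements, here is a minimal sketch. It is not the authors' implementation, which is in the linked repository. The function names, the toy logits, and the use of KL divergence as the disagreement score are all assumptions made for illustration. The abstract does not specify which models are compared, so the second set of logits is simply a placeholder for any alternative model. The selected held-out points would then be used to retrain only the last layer on frozen features.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions (assumed disagreement score)."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps)) for pi, qi in zip(p, q))

def select_by_disagreement(logits_a, logits_b, n):
    """Indices of the n held-out points where the two models disagree most.

    Needs no class or group annotations; labels would only be requested
    for the n selected points.
    """
    scores = [kl_divergence(softmax(a), softmax(b)) for a, b in zip(logits_a, logits_b)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]

def select_misclassified(logits, labels):
    """Indices of held-out points the model gets wrong (needs class labels)."""
    selected = []
    for i, (row, y) in enumerate(zip(logits, labels)):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != y:
            selected.append(i)
    return selected

# Hypothetical toy data: four held-out points, two classes.
erm_logits = [[4.0, 0.0], [3.0, 0.5], [2.5, -1.0], [0.2, 0.0]]
alt_logits = [[3.5, 0.0], [3.0, 0.0], [-1.0, 2.0], [0.0, 0.3]]
labels = [0, 0, 1, 1]

disagreement_idx = select_by_disagreement(erm_logits, alt_logits, n=2)
misclassified_idx = select_misclassified(erm_logits, labels)
```

In this toy setup both strategies pick out the two points that the first model labels as class 0 although their true class is 1. The disagreement variant does so without looking at `labels`, which loosely mirrors the abstract's point that only a small fraction of class annotations is needed.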
