A major challenge to out-of-distribution generalization is reliance on spurious features -- patterns that are predictive of the class label in the training data distribution, but not causally related to the target. Standard methods for reducing the reliance on spurious features typically assume that we know what the spurious feature is, which is rarely true in the real world. Methods that attempt to alleviate this limitation are complex, hard to tune, and lead to a significant computational overhead compared to standard training. In this paper, we propose Automatic Feature Reweighting (AFR), an extremely simple and fast method for updating the model to reduce the reliance on spurious features. AFR retrains the last layer of a standard ERM-trained base model with a weighted loss that emphasizes the examples where the ERM model predicts poorly, automatically upweighting the minority group without group labels. With this simple procedure, we improve upon the best reported results among competing methods trained without spurious attributes on several vision and natural language classification benchmarks, using only a fraction of their compute.
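The reweighting step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract only says that the weighted loss emphasizes examples the ERM model predicts poorly, so the exponential weight `exp(-gamma * p_correct)`, the `gamma` parameter, the plain gradient-descent loop, and all function names here are hypothetical choices made for the example. The features are assumed to be the frozen penultimate-layer outputs of the ERM-trained base model.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def example_weights(erm_logits, labels, gamma=4.0):
    # Probability the ERM model assigns to the correct class of each example.
    p_correct = softmax(erm_logits)[np.arange(len(labels)), labels]
    # Assumed weighting: low correct-class probability -> large weight.
    w = np.exp(-gamma * p_correct)
    return w / w.sum()

def retrain_last_layer(features, labels, weights, W_init, b_init,
                       lr=0.1, steps=200):
    # Gradient descent on the weighted cross-entropy of a linear last layer;
    # the feature extractor is frozen, so only W and b are updated.
    W, b = W_init.copy(), b_init.copy()
    onehot = np.eye(W.shape[1])[labels]
    for _ in range(steps):
        probs = softmax(features @ W + b)
        # Gradient of sum_i w_i * CE_i with respect to the logits.
        grad_logits = weights[:, None] * (probs - onehot)
        W -= lr * features.T @ grad_logits
        b -= lr * grad_logits.sum(axis=0)
    return W, b
```

No group labels appear anywhere in the sketch: the weights depend only on the base model's own predictions, which is what lets minority-group examples (those the ERM model tends to get wrong) be upweighted automatically.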