Learned classifiers should often possess certain invariance properties meant to encourage fairness, robustness, or out-of-distribution generalization. However, multiple recent works empirically demonstrate that common invariance-inducing regularizers are ineffective in the over-parameterized regime, in which classifiers perfectly fit (i.e., interpolate) the training data. This suggests that the phenomenon of "benign overfitting", in which models generalize well despite interpolating, might not favorably extend to settings in which robustness or fairness is desirable. In this work we provide a theoretical justification for these observations. We prove that -- even in the simplest of settings -- any interpolating learning rule (with arbitrarily small margin) will not satisfy these invariance properties. We then propose and analyze an algorithm that -- in the same setting -- successfully learns a non-interpolating classifier that is provably invariant. We validate our theoretical observations on simulated data and the Waterbirds dataset.
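As a toy illustration of the first claim (a sketch under invented assumptions, not the paper's construction), the NumPy snippet below simulates an over-parameterized linear setting with a spurious feature whose agreement with the label differs across two environments. The minimum-norm interpolator fits the training set perfectly, yet its test error differs across environments, i.e., it is not risk-invariant. The data generator `sample`, the agreement rates in `p_sp`, and all scales and dimensions are hypothetical choices made for this demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 50, 1000           # n training samples, d >> n: over-parameterized regime
p_sp = {0: 0.95, 1: 0.6}  # per-environment rate at which the spurious feature agrees with the label

def sample(m, env):
    """Draw m labeled points from environment `env` (toy construction)."""
    y = rng.choice([-1.0, 1.0], size=m)
    core = y[:, None] + rng.normal(size=(m, 1))            # invariant ("core") feature
    agree = rng.random(m) < p_sp[env]
    sp = np.where(agree, y, -y)[:, None] * 2.0             # spurious feature, env-dependent correlation
    noise = rng.normal(size=(m, d - 2))                    # many uninformative dimensions
    return np.hstack([core, sp, noise]), y

# Pool training data from both environments.
X0, y0 = sample(n // 2, 0)
X1, y1 = sample(n // 2, 1)
X, y = np.vstack([X0, X1]), np.concatenate([y0, y1])

# Minimum-norm interpolator: w = X^T (X X^T)^{-1} y, which fits the training set exactly.
w = X.T @ np.linalg.solve(X @ X.T, y)
assert np.all(np.sign(X @ w) == y)  # interpolation: zero training error

# Fresh test data per environment; the gap in error rates shows the
# interpolating classifier leans on the spurious feature and is not invariant.
for env in (0, 1):
    Xt, yt = sample(5000, env)
    err = np.mean(np.sign(Xt @ w) != yt)
    print(f"env {env}: test error {err:.3f}")
```

In this sketch, invariance would mean (approximately) equal risk in both environments; any remedy that achieves it here, such as the non-interpolating algorithm the abstract refers to, must give up a perfect fit on the training data.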