Recent works have demonstrated that neural networks exhibit extreme simplicity bias (SB). That is, they learn only the simplest features needed to solve the task at hand, even in the presence of other, more robust but more complex features. Because there is no general and rigorous definition of features, these works showcase SB on semi-synthetic datasets such as Color-MNIST and MNIST-CIFAR, where defining features is relatively easy. In this work, we rigorously define and thoroughly establish SB for one-hidden-layer neural networks. More concretely, (i) we define SB as the network essentially being a function of a low-dimensional projection of the inputs; (ii) theoretically, we show that when the data is linearly separable, the network primarily depends on only the linearly separable ($1$-dimensional) subspace, even in the presence of an arbitrarily large number of other, more complex features that could have led to a significantly more robust classifier; (iii) empirically, we show that models trained on real datasets such as Imagenette and Waterbirds-Landbirds indeed depend on a low-dimensional projection of the inputs, thereby demonstrating SB on these datasets; and (iv) finally, we present a natural ensemble approach that encourages diversity by training successive models on features not used by earlier models, and we demonstrate that it yields models that are significantly more robust to Gaussian noise.
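To make the definition in (i) concrete, the following is a minimal sketch of what it means for a one-hidden-layer network to be a function of a $1$-dimensional projection of its input. It is an illustration under stated assumptions, not the construction or training procedure used in this work: the network is built by hand so that every hidden weight vector is a multiple of a single unit direction `w`, and all names (`project`, `make_rank_one_network`, `x_perturbed`) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project(x, w):
    """Orthogonal projection of x onto the line spanned by the unit vector w."""
    s = dot(w, x)
    return [s * wi for wi in w]


def make_rank_one_network(w, scales, biases, out_weights):
    """One-hidden-layer ReLU network whose hidden weight vectors are all
    multiples of w (an assumption made purely for illustration)."""
    hidden = [[c * wi for wi in w] for c in scales]

    def f(x):
        acts = [max(0.0, dot(h, x) + b) for h, b in zip(hidden, biases)]
        return dot(out_weights, acts)

    return f


rng = random.Random(0)
d, m = 8, 5  # input dimension and number of hidden units

# Random unit direction w playing the role of the "simple" 1-dimensional subspace.
raw = [rng.gauss(0.0, 1.0) for _ in range(d)]
norm = math.sqrt(dot(raw, raw))
w = [r / norm for r in raw]

f = make_rank_one_network(
    w,
    scales=[rng.gauss(0.0, 1.0) for _ in range(m)],
    biases=[rng.gauss(0.0, 1.0) for _ in range(m)],
    out_weights=[rng.gauss(0.0, 1.0) for _ in range(m)],
)

x = [rng.gauss(0.0, 1.0) for _ in range(d)]

# Perturb x only in directions orthogonal to w.
noise = [rng.gauss(0.0, 1.0) for _ in range(d)]
noise_perp = [n - p for n, p in zip(noise, project(noise, w))]
x_perturbed = [xi + ni for xi, ni in zip(x, noise_perp)]

print(f(x), f(project(x, w)), f(x_perturbed))
```

The three printed values agree up to floating-point error: the output is unchanged when the input is replaced by its projection onto `w`, and it ignores any perturbation orthogonal to `w`. A trained network would satisfy this only approximately, which is why the definition says the network is "essentially" a function of the projection.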