Recent studies have noted an intriguing phenomenon termed Neural Collapse: when a neural network establishes the right correlation between its feature space and the training targets, its last-layer features, together with the classifier weights, collapse into a stable and symmetric structure. In this paper, we extend the investigation of Neural Collapse to biased datasets with imbalanced attributes. We observe that models easily fall into the pitfall of shortcut learning and form a biased, non-collapsed feature space early in training, which is hard to reverse and limits generalization. To tackle the root cause of biased classification, we follow the recent inspiration of prime training and propose an avoid-shortcut learning framework without additional training complexity. With well-designed shortcut primes based on the Neural Collapse structure, models are encouraged to skip the pursuit of simple shortcuts and naturally capture the intrinsic correlations. Experimental results demonstrate that our method induces better convergence properties during training and achieves state-of-the-art generalization performance on both synthetic and real-world biased datasets.
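The abstract does not spell out the "stable and symmetric structure". In the Neural Collapse literature it is commonly described as a simplex equiangular tight frame (ETF): K unit-norm class vectors whose pairwise cosine similarity is -1/(K-1). The sketch below assumes that reading and only illustrates the geometry; it is not the paper's training method, and the names `simplex_etf` and `dot` are hypothetical helpers introduced for this example.

```python
import math


def simplex_etf(num_classes):
    """Return K class vectors of a simplex ETF in R^K, one vector per row.

    Assumption: the standard construction sqrt(K/(K-1)) * (I - (1/K) * ones),
    taken without any additional rotation.
    """
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    # Row i is a centred one-hot vector (e_i minus the mean), rescaled to unit norm.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


num_classes = 4
vectors = simplex_etf(num_classes)

# Every class vector has unit length.
norms = [math.sqrt(dot(v, v)) for v in vectors]

# Every pair of distinct class vectors has the same cosine similarity.
cosines = [
    dot(vectors[i], vectors[j])
    for i in range(num_classes)
    for j in range(i + 1, num_classes)
]

print("norms:", norms)
print("pairwise cosines:", cosines)
print("target cosine -1/(K-1):", -1.0 / (num_classes - 1))
```

Under this assumption, a collapsed feature space is one where the class means of the last-layer features, after centring and normalization, line up with such vectors. A biased, non-collapsed feature space is one that departs from this equal-norm, equal-angle geometry.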