Recent work has observed an intriguing "Neural Collapse" phenomenon in well-trained neural networks, where the last-layer representations of training samples with the same label collapse into each other. This appears to suggest that the last-layer representations are completely determined by the labels and do not depend on the intrinsic structure of the input distribution. We provide evidence that this is not a complete description, and that the apparent collapse hides important fine-grained structure in the representations. Specifically, even when representations apparently collapse, the small amount of remaining variation can still faithfully and accurately capture the intrinsic structure of the input distribution. As an example, if we train on CIFAR-10 until convergence using only 5 coarse-grained labels (each formed by combining two classes into one super-class), we can reconstruct the original 10-class labels from the learned representations via unsupervised clustering. The reconstructed labels achieve $93\%$ accuracy on the CIFAR-10 test set, nearly matching the normal CIFAR-10 accuracy for the same architecture. We also provide an initial theoretical result showing the fine-grained representation structure in a simplified synthetic setting. Our results show concretely how the structure of input data can play a significant role in determining the fine-grained structure of neural representations, going beyond what Neural Collapse predicts.
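The label-reconstruction step described above can be illustrated with a small sketch. The abstract does not name the clustering algorithm or give implementation details, so everything here is an assumption: the sketch uses plain k-means with two clusters inside each super-class, and it runs on synthetic "nearly collapsed" features (a large coarse-class mean plus a small fine-class offset) rather than on real CIFAR-10 representations. The function names `two_means`, `recover_fine_labels`, and `cluster_accuracy` are hypothetical.

```python
import numpy as np


def two_means(x, n_iter=50, seed=0):
    """Plain Lloyd's k-means with k=2; returns a 0/1 cluster id per row of x."""
    rng = np.random.default_rng(seed)
    # Initialise the two centers from two distinct data points.
    centers = x[rng.choice(len(x), size=2, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to each of the two centers.
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels


def recover_fine_labels(features, coarse_labels):
    """Cluster within each super-class; fine id = 2 * coarse id + cluster id."""
    fine = np.empty(len(features), dtype=int)
    for c in np.unique(coarse_labels):
        idx = np.where(coarse_labels == c)[0]
        fine[idx] = 2 * c + two_means(features[idx])
    return fine


def cluster_accuracy(pred, true, coarse_labels):
    """Accuracy up to swapping the two cluster ids inside each super-class."""
    correct = 0
    for c in np.unique(coarse_labels):
        mask = coarse_labels == c
        p = pred[mask] == pred[mask].max()
        t = true[mask] == true[mask].max()
        agree = int(np.sum(p == t))
        # Cluster ids are arbitrary, so take the better of the two matchings.
        correct += max(agree, int(mask.sum()) - agree)
    return correct / len(pred)


# Synthetic stand-in for last-layer features (an assumption, not CIFAR-10 data):
# each super-class has a large mean, and its two fine classes differ only by a
# small offset, mimicking "apparent collapse" with residual variation.
rng = np.random.default_rng(0)
dim, n_per_fine = 16, 200
feats, coarse, fine_true = [], [], []
for c in range(5):
    coarse_mean = 10.0 * rng.normal(size=dim)
    offset = 0.3 * rng.normal(size=dim)
    for s, sign in enumerate((-1.0, 1.0)):
        noise = 0.05 * rng.normal(size=(n_per_fine, dim))
        feats.append(coarse_mean + sign * offset + noise)
        coarse += [c] * n_per_fine
        fine_true += [2 * c + s] * n_per_fine

features = np.vstack(feats)
coarse = np.array(coarse)
fine_true = np.array(fine_true)

fine_pred = recover_fine_labels(features, coarse)
acc = cluster_accuracy(fine_pred, fine_true, coarse)
print(f"fine-label recovery accuracy on synthetic features: {acc:.3f}")
```

In this toy setting the within-super-class variation is tiny compared with the distance between super-class means, yet it is enough for clustering to separate the two fine classes. Applying the same idea to a real network would require extracting its last-layer features first, which this sketch does not attempt.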