Memorization-Dilation: Modeling Neural Collapse Under Label Noise

The notion of neural collapse refers to several emergent phenomena that have been empirically observed across various canonical classification problems. During the terminal phase of training a deep neural network, the feature embeddings of all examples of the same class tend to collapse to a single representation, and the features of different classes tend to separate as much as possible. Neural collapse is often studied through a simplified model, called the unconstrained feature representation, in which the model is assumed to have "infinite expressivity" and can map each data point to any arbitrary representation. In this work, we propose a more realistic variant of the unconstrained feature representation that takes the limited expressivity of the network into account. Empirical evidence suggests that the memorization of noisy data points leads to a degradation (dilation) of the neural collapse. Using a model of the memorization-dilation (M-D) phenomenon, we show one mechanism by which different losses lead to different performances of the trained network on noisy data. Our proofs reveal why label smoothing, a modification of cross-entropy empirically observed to produce a regularization effect, leads to improved generalization in classification tasks.
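
For readers unfamiliar with label smoothing, the following is a minimal sketch of how it modifies the cross-entropy target. It assumes the common convention in which the one-hot target is mixed with a uniform distribution using a smoothing weight `alpha`; the abstract does not state which convention the paper uses, and the function names and example logits are illustrative only.

```python
import math

def smooth_labels(label, num_classes, alpha):
    """Label-smoothed target for one example (assumed convention).

    The true class receives 1 - alpha + alpha / K and every other
    class receives alpha / K, where K is the number of classes.
    """
    off = alpha / num_classes
    target = [off] * num_classes
    target[label] += 1.0 - alpha
    return target

def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]

def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))

# Hypothetical logits for a 3-class problem whose true class is 0.
logits = [2.0, 0.5, -1.0]
hard = smooth_labels(0, 3, alpha=0.0)  # alpha = 0 recovers the one-hot target
soft = smooth_labels(0, 3, alpha=0.1)
print(cross_entropy(logits, hard), cross_entropy(logits, soft))
```

With `alpha = 0` the loss is ordinary cross-entropy. With `alpha > 0` a confident prediction is penalized slightly, which is the regularization effect the abstract refers to.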

