Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data representations in the final layers of Deep Neural Networks (DNNs). Though the phenomenon has been measured in a wide variety of settings, its emergence is only partially understood. In this work, we provide substantial evidence that DNC formation occurs primarily through deep feature learning with the average gradient outer product (AGOP). This goes a step beyond efforts that explain neural collapse through feature-agnostic approaches, such as the unconstrained features model. We first provide evidence that the right singular vectors and singular values of the weight matrices are responsible for the majority of within-class variability collapse in DNNs. As shown in recent work, this singular structure is highly correlated with that of the AGOP. We then establish, both experimentally and theoretically, that the AGOP induces neural collapse in a randomly initialized neural network. In particular, we demonstrate that Deep Recursive Feature Machines, a method originally introduced as an abstraction for AGOP feature learning in convolutional neural networks, exhibit DNC.
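For readers unfamiliar with the AGOP, the following is a minimal sketch of the quantity, assuming the usual definition from the feature-learning literature: for a predictor f mapping R^d to R^c with input-output Jacobian J(x), the AGOP over samples x_1, ..., x_n is the d x d matrix (1/n) Σ_i J(x_i)^T J(x_i). The function names `jacobian_fd` and `agop` are hypothetical, and finite differences stand in for automatic differentiation purely to keep the example dependency-light; this is not the authors' implementation.

```python
import numpy as np


def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f: R^d -> R^c at x, shape (c, d)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Column j approximates the partial derivative of f w.r.t. x[j].
        cols.append((f(x + step) - f(x - step)) / (2.0 * eps))
    return np.stack(cols, axis=1)


def agop(f, X, eps=1e-6):
    """Average over the rows x of X of J(x)^T J(x), a (d, d) matrix."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    total = np.zeros((d, d))
    for x in X:
        J = jacobian_fd(f, x, eps)
        total += J.T @ J
    return total / X.shape[0]


# Illustrative check on a linear map f(x) = W x (random W, random inputs).
# Its Jacobian is W everywhere, so the AGOP should equal W^T W.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
X = rng.normal(size=(20, 5))
M = agop(lambda x: W @ x, X)
```

In this linear case the AGOP equals W^T W exactly, so its eigenvectors are the right singular vectors of W and its eigenvalues are the squared singular values. This is only the trivial instance of the relationship; for trained nonlinear networks the abstract describes the two structures as highly correlated rather than identical.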