Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior work has identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior work is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention that increases NFA correlation at any given layer and dramatically improves the quality of the features learned.
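To make the quantities in the NFA concrete, the following is a minimal sketch of how the correlation between a first-layer Gram matrix and the average gradient outer product (AGOP) might be measured. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy model f(x) = a · relu(W x), the random untrained weights, the use of cosine similarity as the correlation measure, and the helper names `agop` and `nfa_correlation`.

```python
import numpy as np

def agop(W, a, X):
    """Average gradient outer product of f(x) = a . relu(W x) with respect to x.

    W: (k, d) first-layer weights, a: (k,) output weights, X: (n, d) inputs.
    """
    pre = X @ W.T                         # (n, k) pre-activations
    mask = (pre > 0).astype(X.dtype)      # derivative of relu at each unit
    grads = (mask * a) @ W                # (n, d): row i is the input gradient at x_i
    return grads.T @ grads / X.shape[0]   # (d, d) average of outer products

def nfa_correlation(W, a, X):
    """Cosine similarity (Frobenius inner product) between W^T W and the AGOP."""
    gram = W.T @ W
    G = agop(W, a, X)
    return float(np.sum(gram * G) / (np.linalg.norm(gram) * np.linalg.norm(G)))

# Hypothetical toy setup: random weights and Gaussian inputs (no training).
rng = np.random.default_rng(0)
d, k, n = 5, 16, 200
W = rng.normal(size=(k, d))
a = rng.normal(size=k)
X = rng.normal(size=(n, d))

G = agop(W, a, X)
corr = nfa_correlation(W, a, X)
print(f"NFA correlation at initialization: {corr:.3f}")
```

Both matrices are positive semidefinite, so this similarity lies between 0 and 1. Tracking it over the course of training is one way to observe the development of the correlation that the abstract refers to.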