In recent years, neural networks have achieved impressive results on many technological and scientific tasks. Yet the mechanism through which these models automatically select features, or patterns in data, for prediction remains unclear. Identifying such a mechanism is key to advancing the performance and interpretability of neural networks and to promoting reliable adoption of these models in scientific applications. In this paper, we identify and characterize the mechanism through which deep fully connected neural networks learn features. We posit the Deep Neural Feature Ansatz, which states that neural feature learning occurs by implementing the average gradient outer product to up-weight features strongly related to model output. Our ansatz sheds light on various deep learning phenomena, including the emergence of spurious features and simplicity biases, and on how pruning networks can increase performance (the "lottery ticket hypothesis"). Moreover, the mechanism identified in our work leads to a backpropagation-free method for feature learning with any machine learning model. To demonstrate the effectiveness of this feature learning mechanism, we use it to enable feature learning in kernel machines, classical models that do not otherwise learn features, and show that the resulting models, which we refer to as Recursive Feature Machines, achieve state-of-the-art performance on tabular data.
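To make the central quantity concrete, the following is a minimal sketch of an average gradient outer product, assuming its usual definition: the mean over data points of the outer product of a predictor's input gradient with itself. It is an illustration only, not the paper's implementation. The names `numerical_gradient`, `average_gradient_outer_product`, and `toy_predictor` are hypothetical, the predictor is a hand-written toy function rather than a trained network, and gradients are estimated by finite differences instead of automatic differentiation.

```python
import random


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at point x."""
    grad = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad


def average_gradient_outer_product(f, points):
    """Mean over `points` of grad f(x) grad f(x)^T, returned as a d x d list of lists."""
    d = len(points[0])
    n = len(points)
    agop = [[0.0] * d for _ in range(d)]
    for x in points:
        g = numerical_gradient(f, x)
        for a in range(d):
            for b in range(d):
                agop[a][b] += g[a] * g[b] / n
    return agop


def toy_predictor(x):
    # Hypothetical predictor (an assumption for illustration): it depends
    # strongly on coordinate 0, weakly on coordinate 1, and not at all on
    # coordinate 2.
    return 3.0 * x[0] + x[1] ** 2


rng = random.Random(0)
points = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]

agop = average_gradient_outer_product(toy_predictor, points)
# The diagonal measures how strongly the output varies with each input coordinate.
diag = [agop[j][j] for j in range(3)]
print(diag)
```

In this toy setting the diagonal entry for coordinate 0 is the largest and the entry for the ignored coordinate 2 is zero, which is the sense in which the matrix up-weights inputs that strongly influence the output.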