Feature learning in neural networks and kernel machines that recursively learn features

Neural networks have achieved impressive results on many technological and scientific tasks. Yet, their empirical successes have outpaced our fundamental understanding of their structure and function. Identifying mechanisms driving the successes of neural networks can provide principled approaches for improving neural network performance and developing simple and effective alternatives. In this work, we isolate a key mechanism driving feature learning in fully connected neural networks by connecting neural feature learning to a statistical estimator known as the average gradient outer product. We subsequently leverage this mechanism to design Recursive Feature Machines (RFMs), which are kernel machines that learn features. We show that RFMs (1) accurately capture features learned by deep fully connected neural networks, and (2) outperform a broad spectrum of models, including neural networks, on tabular data. Furthermore, we show how RFMs shed light on recently observed deep learning phenomena including grokking, lottery tickets, simplicity biases, and spurious features. We provide a Python implementation to make our method easily accessible: https://github.com/aradha/recursive_feature_machines
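To make the mechanism concrete, here is a minimal NumPy sketch of the idea described in the abstract: fit a kernel machine, compute the average gradient outer product of the fitted predictor over the training inputs, use that matrix to reweight the kernel's distance, and repeat. Several choices here are assumptions for illustration and are not taken from the abstract: the Gaussian kernel with a Mahalanobis distance (picked because its gradient has a simple closed form), the ridge regularizer, the fixed iteration count, the final refit, and the names `fit_rfm`, `agop`, `gaussian_kernel`, and `mahalanobis_sq`. It is not the API of the linked repository.

```python
import numpy as np


def mahalanobis_sq(X, Z, M):
    """Pairwise (x - z)^T M (x - z) for rows of X and Z; M is assumed symmetric."""
    XM = X @ M
    d = (XM * X).sum(axis=1)[:, None] - 2.0 * XM @ Z.T + ((Z @ M) * Z).sum(axis=1)[None, :]
    return np.maximum(d, 0.0)  # clip tiny negative values caused by round-off


def gaussian_kernel(X, Z, M, bandwidth):
    """K_M(x, z) = exp(-(x - z)^T M (x - z) / (2 * bandwidth^2)). Assumed kernel choice."""
    return np.exp(-mahalanobis_sq(X, Z, M) / (2.0 * bandwidth**2))


def agop(X, alpha, M, bandwidth):
    """Average gradient outer product of f(x) = sum_j alpha_j K_M(x, x_j) over rows of X."""
    K = gaussian_kernel(X, X, M, bandwidth)
    W = K * alpha[None, :]                       # W[i, j] = alpha_j * K_M(x_i, x_j)
    S = W.sum(axis=1)[:, None] * X - W @ X       # S[i] = sum_j W[i, j] * (x_i - x_j)
    G = -(S @ M) / bandwidth**2                  # G[i] = gradient of f at x_i
    return G.T @ G / X.shape[0]                  # (1/n) * sum_i grad_i grad_i^T


def fit_rfm(X, y, n_iters=3, bandwidth=2.0, ridge=1e-3):
    """Alternate kernel ridge regression with an update of the feature matrix M."""
    n, d = X.shape
    M = np.eye(d)                                # start from the plain Euclidean distance
    for _ in range(n_iters):
        K = gaussian_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + ridge * np.eye(n), y)
        M = agop(X, alpha, M, bandwidth)         # learned features: reweight input directions
    # Refit once so the returned coefficients match the returned M.
    K = gaussian_kernel(X, X, M, bandwidth)
    alpha = np.linalg.solve(K + ridge * np.eye(n), y)
    return M, alpha
```

As a usage example, with inputs `X` of shape `(n, d)` and a target that depends on only one coordinate, `M, alpha = fit_rfm(X, y)` returns a `d`-by-`d` matrix whose diagonal can be read as a per-coordinate relevance score. Predictions at new points `X_new` would be `gaussian_kernel(X_new, X, M, 2.0) @ alpha`, using the same bandwidth as in fitting.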

