Feature emergence via margin maximization: case studies in algebraic tasks

Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While recent work has, in some cases, made significant strides toward understanding how neural networks implement specific target functions, this paper explores a complementary question: why do networks arrive at particular computational strategies? Our inquiry focuses on the algebraic learning tasks of modular addition, sparse parities, and finite group operations. Our primary theoretical findings analytically characterize the features learned by stylized neural networks for these algebraic tasks. Notably, our main technique demonstrates how the principle of margin maximization alone can be used to fully specify the features learned by the network. Specifically, we prove that the trained networks utilize Fourier features to perform modular addition and employ features corresponding to irreducible group-theoretic representations to perform compositions in general groups, aligning closely with the empirical observations of Nanda et al. and Chughtai et al. More generally, we hope our techniques can help foster a deeper understanding of why neural networks adopt specific computational strategies.
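To make the phrase "Fourier features to perform modular addition" concrete, the following is a minimal sketch of the underlying trigonometric identity, not the paper's network, training procedure, or proof. It assumes a prime modulus `p` and a hand-picked set of frequencies `freqs`. The function names `fourier_logits` and `predict` are hypothetical, introduced only for illustration. Each candidate output `c` is scored by summing `cos(2*pi*k*(a + b - c)/p)` over the frequencies `k`, and every term equals 1 exactly when `c` is congruent to `a + b` modulo `p`, so the highest score lands on the correct answer.

```python
import math


def fourier_logits(a, b, p, freqs):
    """Score every candidate c in 0..p-1 using cosine (Fourier) features.

    Illustrative assumption: the score for c is the sum over frequencies k
    of cos(2*pi*k*(a + b - c) / p).
    """
    return [
        sum(math.cos(2 * math.pi * k * (a + b - c) / p) for k in freqs)
        for c in range(p)
    ]


def predict(a, b, p, freqs):
    """Return the candidate c with the largest Fourier score."""
    logits = fourier_logits(a, b, p, freqs)
    return max(range(p), key=logits.__getitem__)


p = 7              # small prime modulus, chosen only for the example
freqs = [1, 2, 3]  # illustrative frequencies, not taken from the paper

# Check that the argmax recovers (a + b) mod p for every input pair.
all_correct = all(
    predict(a, b, p, freqs) == (a + b) % p
    for a in range(p)
    for b in range(p)
)
```

In this sketch the frequencies are fixed by hand. The abstract's claim is the stronger one that margin maximization alone leads trained networks to features of this Fourier form.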

