Studying the interplay between the geometry of the loss landscape and the optimization trajectories of simple neural networks is a fundamental step for understanding their behavior in more complex settings. This paper reveals the presence of a topological obstruction in the loss landscape of shallow ReLU neural networks trained using gradient flow. We discuss how the homogeneous nature of the ReLU activation function constrains the training trajectories to lie on a product of quadric hypersurfaces whose shape depends on the particular initialization of the network's parameters. When the neural network's output is a single scalar, we prove that these quadrics can have multiple connected components, limiting the set of parameters reachable during training. We analytically compute the number of these components and discuss the possibility of mapping one onto another through neuron rescaling and permutation. In this simple setting, we find that the non-connectedness results in a topological obstruction, which, depending on the initialization, can make the global optimum unreachable. We validate this result with numerical experiments.
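As a minimal sketch of the quadric constraint mentioned above, assume the standard shallow scalar-output model $f(x) = \sum_i a_i\,\mathrm{ReLU}(w_i^\top x)$ trained by gradient flow on a loss $L$ (this notation is illustrative and not fixed by the abstract). Since $\mathrm{ReLU}$ is positively $1$-homogeneous, the map $(w_i, a_i) \mapsto (\lambda w_i, a_i/\lambda)$ leaves $f$ unchanged for $\lambda > 0$, which gives $w_i^\top \nabla_{w_i} L = a_i\, \partial_{a_i} L$ and hence, along the flow $\dot{w}_i = -\nabla_{w_i} L$, $\dot{a}_i = -\partial_{a_i} L$,
\[
  \frac{d}{dt}\Bigl(\lVert w_i(t)\rVert^2 - a_i(t)^2\Bigr)
  = -2\, w_i^\top \nabla_{w_i} L + 2\, a_i\, \partial_{a_i} L = 0 .
\]
Each neuron's trajectory therefore stays on the quadric $\{(w_i, a_i) : \lVert w_i\rVert^2 - a_i^2 = c_i\}$, with $c_i$ fixed by the initialization; when $c_i < 0$ this set splits into the two components $a_i \geq \sqrt{-c_i}$ and $a_i \leq -\sqrt{-c_i}$, which is the kind of non-connectedness underlying the obstruction discussed in the paper.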