We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU-layers in neural networks. The parameters of a ReLU-layer induce a natural partition of the input domain such that, within each sector of the partition, the ReLU-layer simplifies greatly. This leads to a geometric interpretation of a ReLU-layer as a projection onto a polyhedral cone followed by an affine transformation, in line with the description in [doi:10.48550/arXiv.1905.08922] for convolutional networks with ReLU activations. Further, this structure yields simplified expressions for preimages of the intersection between partition sectors and hyperplanes, which is useful when describing decision boundaries in a classification setting. We investigate this in detail for a feed-forward network with one hidden ReLU-layer: we provide results on the geometric complexity of the decision boundary generated by such networks and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries. Finally, we discuss the effect of adding more layers to the network.
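As a minimal illustration of the sector-wise simplification described above, the following sketch checks that, once the activation pattern of an input is fixed, a ReLU-layer $x \mapsto \max(0, Wx + b)$ agrees with a plain affine map on that input. The function names and the example weights are hypothetical and chosen only for illustration; the sketch covers the partition into sectors only, not the cone-projection interpretation or the decision-boundary results.

```python
# Illustrative sketch (hypothetical names and weights, not the paper's code).
# A ReLU-layer is affine on each sector of the partition induced by (W, b).

def affine(W, b, x):
    """Return W x + b for a list-of-rows matrix W and lists b, x."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu_layer(W, b, x):
    """Return max(0, W x + b), applied componentwise."""
    return [max(0.0, z) for z in affine(W, b, x)]

def activation_pattern(W, b, x):
    """Return a 0/1 tuple marking which units are active at x.

    Inputs sharing a pattern lie in the same sector of the partition.
    """
    return tuple(1 if z > 0 else 0 for z in affine(W, b, x))

def sector_map(W, b, pattern):
    """Return the affine map (W_s, b_s) the layer reduces to on a sector.

    Rows of W and entries of b belonging to inactive units are zeroed out.
    """
    W_s = [[p * w for w in row] for row, p in zip(W, pattern)]
    b_s = [p * bi for bi, p in zip(b, pattern)]
    return W_s, b_s

# Example weights for d = 2 (assumed values for illustration only).
W = [[1.0, -1.0], [2.0, 1.0]]
b = [0.0, -1.0]

x = [0.0, 2.0]
pattern = activation_pattern(W, b, x)
W_s, b_s = sector_map(W, b, pattern)

# On the sector containing x, the layer and the affine map coincide.
assert relu_layer(W, b, x) == affine(W_s, b_s, x)
```

With $d$ units there are at most $2^d$ activation patterns, so the tuple returned by `activation_pattern` serves as a convenient label for the sector containing a given input.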