We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks. The parameters of a ReLU layer induce a natural partition of the input domain, and within each sector of this partition the layer simplifies considerably. This leads to a geometric interpretation of a ReLU layer as a projection onto a polyhedral cone followed by an affine transformation, in line with the description in [doi:10.48550/arXiv.1905.08922] for convolutional networks with ReLU activations. Further, this structure yields simplified expressions for preimages of the intersections between partition sectors and hyperplanes, which are useful when describing decision boundaries in a classification setting. We investigate this in detail for a feed-forward network with one hidden ReLU layer: we provide results on the geometric complexity of the decision boundary generated by such networks, and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries. Finally, we discuss the effect of adding more layers to the network.
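As a rough illustration of the sector-wise simplification described above, the following minimal sketch assumes that a sector is identified by the sign pattern of the pre-activation $Wx + b$; this is the usual activation-pattern view and may differ in detail from the definition used in the paper. The names `relu_layer`, `activation_pattern`, and `sector_affine_map` are hypothetical, and the square $d \times d$ weight matrix is an assumption made for the example. The sketch only checks that, on the sector containing a given point, the ReLU layer coincides with a single affine map; it does not implement the cone-projection interpretation.

```python
import numpy as np


def relu_layer(W, b, x):
    """Standard fully connected ReLU layer: max(Wx + b, 0) elementwise."""
    return np.maximum(W @ x + b, 0.0)


def activation_pattern(W, b, x):
    """0/1 vector marking which units are active at x.

    Assumption: points sharing this pattern lie in the same sector.
    """
    return (W @ x + b > 0).astype(float)


def sector_affine_map(W, b, pattern):
    """Affine map (A, c) that the layer reduces to on the sector given by `pattern`.

    Inactive units are zeroed out by the diagonal 0/1 matrix D.
    """
    D = np.diag(pattern)
    return D @ W, D @ b


# Random d-dimensional layer and input (illustrative values only).
rng = np.random.default_rng(0)
d = 3
W = rng.standard_normal((d, d))
b = rng.standard_normal(d)
x = rng.standard_normal(d)

pattern = activation_pattern(W, b, x)
A, c = sector_affine_map(W, b, pattern)

# On the sector containing x, the ReLU layer agrees with the affine map.
assert np.allclose(relu_layer(W, b, x), A @ x + c)
```

With $d$ units there are at most $2^d$ such sign patterns, so under this assumption the layer is described by at most $2^d$ affine pieces, one per non-empty sector.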