Many feedforward neural networks (NNs) generate continuous and piecewise-linear (CPWL) mappings. Specifically, they partition the input domain into regions on which the mapping is affine. The number of these so-called linear regions offers a natural metric to characterize the expressiveness of CPWL NNs. Determining this quantity exactly is often out of reach in practice, so bounds have been proposed for specific architectures, notably ReLU and Maxout NNs. In this work, we generalize these bounds to NNs with arbitrary and possibly multivariate CPWL activation functions. We first provide upper and lower bounds on the maximal number of linear regions of a CPWL NN given its depth, its width, and the number of linear regions of its activation functions. Our results rely on the combinatorial structure of convex partitions and confirm the distinctive role of depth which, on its own, is able to increase the number of regions exponentially. We then introduce a complementary stochastic framework to estimate the average number of linear regions produced by a CPWL NN. Under reasonable assumptions, the expected density of linear regions along any 1D path is bounded by the product of depth, width, and a measure of activation complexity (up to a scaling factor). In this average-case setting, the three sources of expressiveness therefore play an identical role: the exponential growth with depth is no longer observed.
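To make the notion of "linear regions along a 1D path" concrete, the following minimal sketch estimates how many affine pieces a randomly initialized ReLU network produces along a straight segment. It is only an illustration under stated assumptions, not the framework of this work: the helper names (`random_relu_net`, `activation_patterns`, `count_pieces_on_segment`), the Gaussian initialization, and the dense-sampling strategy are all hypothetical choices, and every layer is treated as a ReLU layer. Changes of activation pattern are used as a proxy for region boundaries, so the count can miss boundaries that fall between two consecutive samples.

```python
import numpy as np


def random_relu_net(widths, rng):
    """Random (W, b) pairs for ReLU layers; widths[0] is the input dimension.

    The Gaussian scales are an arbitrary illustrative choice.
    """
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in)),
         rng.normal(0.0, 0.1, size=n_out))
        for n_in, n_out in zip(widths[:-1], widths[1:])
    ]


def activation_patterns(layers, points):
    """On/off pattern of every ReLU unit at each row of `points`."""
    h = points
    patterns = []
    for W, b in layers:
        pre = h @ W.T + b           # pre-activations, shape (n_points, n_out)
        patterns.append(pre > 0)    # which units are active
        h = np.maximum(pre, 0.0)    # ReLU
    return np.concatenate(patterns, axis=1)


def count_pieces_on_segment(layers, start, end, n_samples=20001):
    """Estimate the number of affine pieces along the segment [start, end].

    Counts how often the activation pattern changes between consecutive
    samples; closely spaced boundaries may be merged into a single change.
    """
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    points = (1.0 - t) * start + t * end
    pats = activation_patterns(layers, points)
    changes = np.any(pats[1:] != pats[:-1], axis=1)
    return int(changes.sum()) + 1


# Illustrative usage: a network with 2 inputs and three ReLU layers of width 16.
rng = np.random.default_rng(0)
net = random_relu_net([2, 16, 16, 16], rng)
n_pieces = count_pieces_on_segment(net, np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
print("estimated pieces along the segment:", n_pieces)
```

Repeating the count over several random seeds while varying the depth and the width separately gives a rough empirical view of how each factor affects the number of pieces along a path.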