The choice of architecture of a neural network determines which functions the network can realize, and the expressiveness of a chosen architecture has consequently received much attention. In ReLU neural networks, the presence of stably unactivated neurons can reduce the network's expressiveness. In this work, we investigate the probability that a neuron in the second hidden layer of such a network is stably unactivated when the weights and biases are initialized from symmetric probability distributions. For networks with input dimension $n_0$, we prove that if the first hidden layer has $n_0+1$ neurons then this probability is exactly $\frac{2^{n_0}+1}{4^{n_0+1}}$, and if the first hidden layer has $n_1$ neurons with $n_1 \le n_0$, then the probability is $\frac{1}{2^{n_1+1}}$. Finally, for the case in which the first hidden layer has more than $n_0+1$ neurons, a conjecture is proposed together with its supporting rationale, and computational evidence is presented in its favor.
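The closed-form probabilities above can be sanity-checked numerically. The following is a minimal Monte Carlo sketch (not the paper's experimental setup): it draws all weights and biases i.i.d. from a standard normal distribution, an example of a symmetric distribution, and approximates "stably unactivated" for one second-hidden-layer neuron by testing whether its pre-activation stays non-positive on a large cloud of random inputs. Because only sampled inputs are probed, this test can over-count the event; the network sizes, trial counts, and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def stably_unactivated_fraction(n0, n1, trials=10_000, n_inputs=2_000):
    """Estimate the probability that a fixed neuron in the second hidden
    layer never activates, with weights/biases drawn i.i.d. from a
    symmetric distribution (standard normal here, as an example)."""
    count = 0
    for _ in range(trials):
        W1 = rng.standard_normal((n1, n0))  # first-layer weights
        b1 = rng.standard_normal(n1)        # first-layer biases
        w2 = rng.standard_normal(n1)        # weights into the probed neuron
        b2 = rng.standard_normal()          # its bias
        # Probe the pre-activation on a wide cloud of random inputs; if it
        # stays non-positive on all of them, count the neuron as
        # (approximately) stably unactivated.  This is only a proxy for the
        # exact condition over the whole input space.
        X = 10.0 * rng.standard_normal((n_inputs, n0))
        h = np.maximum(W1 @ X.T + b1[:, None], 0.0)  # first-layer ReLU output
        pre = w2 @ h + b2                            # second-layer pre-activation
        if np.all(pre <= 0.0):
            count += 1
    return count / trials

n0 = 3
print("n1 = n0 + 1:", stably_unactivated_fraction(n0, n0 + 1),
      "predicted:", (2**n0 + 1) / 4**(n0 + 1))
print("n1 = n0    :", stably_unactivated_fraction(n0, n0),
      "predicted:", 1 / 2**(n0 + 1))
```

For $n_0 = 3$ the predicted values are $9/256 \approx 0.035$ and $1/16 = 0.0625$, respectively; the empirical fractions should land close to these, up to Monte Carlo error and the bias introduced by the finite input sample.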