Effective initialization of deep networks requires an understanding of random neural networks. In this work, we provide a rigorous probabilistic analysis of deep, bias-free random Leaky ReLU networks. We prove a Law of Large Numbers and a Central Limit Theorem for the logarithm of the norm of the network activations, establishing that, as the number of layers increases, the growth of the activations is governed by a parameter called the Lyapunov exponent. This parameter characterizes a sharp phase transition between vanishing and exploding activations, and we calculate it explicitly for Gaussian or orthogonal weight matrices. Our results reveal that standard methods, such as He initialization or orthogonal initialization, do not guarantee activation stability for deep networks of low width. Based on these theoretical insights, we propose a novel initialization method, referred to as Lyapunov initialization, which sets the Lyapunov exponent to zero and thereby keeps the neural network as stable as possible; empirically, this leads to improved learning.
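To make the role of the Lyapunov exponent concrete, the following is a minimal Monte Carlo sketch, not the closed-form calculation referred to above. It assumes i.i.d. Gaussian weights with variance `weight_var` per entry, a fresh weight matrix at every layer, and no biases. The names `estimate_lyapunov`, `leaky_relu`, `weight_var`, and `n_trials` are illustrative and do not come from the paper. The quantity estimated is the average per-layer growth of the log-norm of the activations, (1/depth) · log(‖x_depth‖ / ‖x_0‖).

```python
import math
import random


def leaky_relu(v, slope):
    """Apply Leaky ReLU with the given negative slope to a list of floats."""
    return [t if t > 0.0 else slope * t for t in v]


def estimate_lyapunov(width, depth, slope, weight_var, n_trials=20, seed=0):
    """Monte Carlo estimate of (1/depth) * E[log(||x_depth|| / ||x_0||)].

    Assumption (illustrative): every layer uses a fresh width-by-width matrix
    with i.i.d. N(0, weight_var) entries and no bias term.
    """
    rng = random.Random(seed)
    std = math.sqrt(weight_var)
    total = 0.0
    for _ in range(n_trials):
        # Start from a random unit vector.
        x = [rng.gauss(0.0, 1.0) for _ in range(width)]
        norm = math.sqrt(sum(t * t for t in x))
        x = [t / norm for t in x]
        log_growth = 0.0
        for _ in range(depth):
            # Each output coordinate is the dot product of x with a fresh
            # Gaussian row, i.e. one row of the random weight matrix.
            pre = [sum(rng.gauss(0.0, std) * t for t in x) for _ in range(width)]
            x = leaky_relu(pre, slope)
            norm = math.sqrt(sum(t * t for t in x))
            if norm == 0.0:
                raise ValueError("activations collapsed to zero; use slope != 0")
            log_growth += math.log(norm)
            # Renormalize so the simulation never underflows or overflows.
            x = [t / norm for t in x]
        total += log_growth / depth
    return total / n_trials


# He-style variance for Leaky ReLU: 2 / ((1 + slope^2) * fan_in).
width, slope = 2, 0.1
he_var = 2.0 / ((1.0 + slope ** 2) * width)
lam = estimate_lyapunov(width, depth=500, slope=slope, weight_var=he_var)
print(f"estimated Lyapunov exponent at width {width}: {lam:.3f}")
```

Under these assumptions, He-style scaling preserves the expected squared norm from layer to layer, yet the estimate for a narrow network tends to come out negative, since the expected logarithm lies below the logarithm of the expectation. This is in line with the observation that He initialization does not guarantee stability at low width. Rescaling `weight_var` by exp(-2 · lam) would shift the estimated exponent towards zero, which is one way to picture the idea behind Lyapunov initialization.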