The development of effective initialization methods requires an understanding of random neural networks. In this work, we provide a rigorous probabilistic analysis of deep unbiased Leaky ReLU networks. We prove a Law of Large Numbers and a Central Limit Theorem for the logarithm of the norm of the network activations, establishing that, as the number of layers increases, their growth is governed by a parameter called the Lyapunov exponent. This parameter characterizes a sharp phase transition between vanishing and exploding activations, and we compute the Lyapunov exponent explicitly for Gaussian and orthogonal weight matrices. Our results reveal that standard methods, such as He initialization or orthogonal initialization, do not guarantee activation stability for deep networks of low width. Based on these theoretical insights, we propose a novel initialization method, referred to as Lyapunov initialization, which sets the Lyapunov exponent to zero and thereby makes the neural network as stable as possible, leading empirically to improved learning.
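To make the central quantity concrete, the following is a minimal Monte Carlo sketch (not the paper's code) that estimates the Lyapunov exponent of a random unbiased Leaky ReLU network as the average per-layer increment of the log activation norm. The function name `empirical_lyapunov`, the negative slope a = 0.1, and the choice of a He-style fan-in variance for Leaky ReLU are illustrative assumptions; the abstract's claim that such standard scalings yield a strictly negative exponent at low width can be checked by varying `width`.

```python
import numpy as np

def leaky_relu(x, a=0.1):
    # Unbiased Leaky ReLU activation with negative slope a.
    return np.where(x > 0, x, a * x)

def empirical_lyapunov(width, depth, a=0.1, sigma2=None, trials=200, seed=None):
    """Monte Carlo estimate of the Lyapunov exponent: the mean per-layer
    increment of log ||h_l|| for a random Gaussian Leaky ReLU network.
    sigma2 is the per-entry weight variance; the default is the He-style
    second-moment-preserving choice 2 / ((1 + a^2) * width) for Leaky ReLU.
    """
    rng = np.random.default_rng(seed)
    if sigma2 is None:
        sigma2 = 2.0 / ((1.0 + a**2) * width)
    increments = []
    for _ in range(trials):
        h = rng.standard_normal(width)
        h /= np.linalg.norm(h)
        for _ in range(depth):
            W = rng.normal(0.0, np.sqrt(sigma2), size=(width, width))
            h_new = leaky_relu(W @ h, a)
            increments.append(np.log(np.linalg.norm(h_new)))
            # Leaky ReLU is positively homogeneous, so renormalizing the
            # input of each layer leaves the log-norm increments exact
            # while avoiding numerical under/overflow at large depth.
            h = h_new / np.linalg.norm(h_new)
    return float(np.mean(increments))

# At small width the estimated exponent is negative (activations vanish);
# it approaches zero as the width grows.
for n in (4, 16, 64, 256):
    print(n, empirical_lyapunov(n, depth=50, seed=0))
```

Under this reading, Lyapunov initialization amounts to rescaling the weight variance so that this per-layer log-norm increment is zero rather than merely preserving the second moment; by Jensen's inequality the two criteria differ at finite width, which is why the gap only closes as the width grows.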