We prove a large deviation principle for deep neural networks with Gaussian weights and activation functions of at most linear growth, such as ReLU. This generalises earlier work, in which only bounded and continuous activation functions were considered. In practice, activation functions of linear growth such as ReLU are the most commonly used. We furthermore simplify previous expressions for the rate function and provide a power-series expansion for the ReLU case.
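For concreteness, a typical setup covered by such results is the layer recursion below; the $1/\sqrt{n_\ell}$ width scaling and the variance parameter $\sigma_w^2$ are illustrative assumptions, standard for Gaussian-weight networks, and are not taken from the abstract itself:
\[
  z^{(\ell+1)} = \frac{1}{\sqrt{n_\ell}}\, W^{(\ell+1)} \phi\!\bigl(z^{(\ell)}\bigr),
  \qquad W^{(\ell+1)}_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,\sigma_w^2),
  \qquad \phi(x) = \max(x,0),
\]
where the ReLU activation $\phi$ satisfies the linear-growth bound $|\phi(x)| \le |x|$, the condition under which the large deviation principle is stated, while failing the boundedness assumption of the earlier work.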