We prove a large deviation principle for deep neural networks with Gaussian weights and (at most linearly growing) activation functions. This generalises earlier work, in which bounded and continuous activation functions were considered. In practice, linearly growing activation functions such as ReLU are the most commonly used. We furthermore simplify previous expressions for the rate function and give power-series expansions for the ReLU case.
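For context, the following is the standard textbook formulation of a large deviation principle, not the specific statement proven in this work: a sequence of probability measures $(\mu_n)_{n \ge 1}$ on a topological space $\mathcal{X}$ satisfies a large deviation principle with speed $n$ and lower semicontinuous rate function $I \colon \mathcal{X} \to [0,\infty]$ if, for every Borel set $A$,
\[
  -\inf_{x \in A^{\circ}} I(x)
  \;\le\; \liminf_{n \to \infty} \frac{1}{n} \log \mu_n(A)
  \;\le\; \limsup_{n \to \infty} \frac{1}{n} \log \mu_n(A)
  \;\le\; -\inf_{x \in \overline{A}} I(x),
\]
where $A^{\circ}$ and $\overline{A}$ denote the interior and closure of $A$.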