We study the distribution of a fully connected neural network with random Gaussian weights and biases in which the hidden layer widths are proportional to a large constant $n$. Under mild assumptions on the non-linearity, we obtain quantitative bounds on normal approximations valid at large but finite $n$ and any fixed network depth. Our theorems show, both for the finite-dimensional distributions and for the entire process, that the distance between a random fully connected network (and its derivatives) and the corresponding infinite-width Gaussian process scales like $n^{-\gamma}$ for some $\gamma>0$, with the exponent depending on the metric used to measure the discrepancy. Our bounds are strictly stronger in their dependence on network width than any previously available in the literature; in the one-dimensional case, we also prove that they are optimal, i.e., we establish matching lower bounds.
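As a rough numerical illustration of the setting described above, the following sketch samples many independent random fully connected networks at a fixed input and tracks how far the scalar output is from Gaussian as the width $n$ grows. Everything specific here is an assumption made for illustration and is not taken from the paper: the ReLU non-linearity, the weight variance `c_w / fan_in`, the bias variance `c_b`, the helper names `sample_outputs` and `excess_kurtosis`, and the use of excess kurtosis as a crude stand-in for the distances the theorems actually bound.

```python
import numpy as np


def sample_outputs(width, depth, x, num_networks, rng, c_w=2.0, c_b=0.0):
    """Scalar outputs at input x of `num_networks` independent random networks.

    Assumed setup (illustrative only): `depth` hidden ReLU layers of equal
    width, weights ~ N(0, c_w / fan_in), biases ~ N(0, c_b).
    """
    h = np.tile(x, (num_networks, 1))  # shape (num_networks, d_in)
    fan_in = x.shape[0]
    for _ in range(depth):
        # One independent weight matrix and bias vector per sampled network.
        W = np.sqrt(c_w / fan_in) * rng.standard_normal((num_networks, width, fan_in))
        b = np.sqrt(c_b) * rng.standard_normal((num_networks, width))
        z = np.einsum("sij,sj->si", W, h) + b  # pre-activations
        h = np.maximum(z, 0.0)                 # ReLU
        fan_in = width
    # Linear read-out to a single scalar output per network.
    w_out = np.sqrt(c_w / fan_in) * rng.standard_normal((num_networks, fan_in))
    return np.einsum("sj,sj->s", w_out, h)


def excess_kurtosis(samples):
    """Empirical fourth standardized moment minus 3 (zero for a Gaussian)."""
    centered = samples - samples.mean()
    return float(np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.ones(3)
    for n in (4, 16, 64):
        outputs = sample_outputs(width=n, depth=2, x=x, num_networks=4000, rng=rng)
        print(f"width n={n:3d}  excess kurtosis ~ {excess_kurtosis(outputs):.3f}")
```

The excess kurtosis should shrink toward zero as $n$ increases, which is consistent with convergence to a Gaussian limit, but a Monte Carlo estimate of this kind does not measure the exponent $\gamma$ in any of the metrics considered in the paper.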