We study the distribution of a fully connected neural network with random Gaussian weights and biases in which the hidden layer widths are proportional to a large constant $n$. Under mild assumptions on the non-linearity, we obtain quantitative bounds on normal approximations valid at large but finite $n$ and any fixed network depth. Our theorems show, both for the finite-dimensional distributions and for the entire process, that the distance between a random fully connected network (and its derivatives) and the corresponding infinite-width Gaussian process scales like $n^{-\gamma}$ for some $\gamma>0$, with the exponent depending on the metric used to measure the discrepancy. Our bounds are strictly stronger in their dependence on network width than any previously available in the literature; in the one-dimensional case, we also prove that they are optimal, i.e., we establish matching lower bounds.
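As a rough numerical illustration of the width dependence, the following sketch simulates a toy case that is an assumption of this example and not the setting of the theorems: a single hidden layer with ReLU non-linearity, no biases, unit-variance Gaussian weights, and an input normalized so that each hidden pre-activation is standard normal. The excess kurtosis of the scalar output is used only as a crude proxy for non-Gaussianity; it is not one of the metrics for which the bounds are stated. Under these toy assumptions a direct moment computation gives an excess kurtosis of $15/n$, which the simulation should reproduce approximately. All function names are hypothetical.

```python
import random


def relu(x):
    """ReLU non-linearity (an assumption of this toy example)."""
    return x if x > 0.0 else 0.0


def sample_output(width, rng):
    """Draw one output z = n^{-1/2} * sum_i W_i * relu(h_i).

    Both W_i and h_i are independent standard normals, which models a
    one-hidden-layer network evaluated at a single normalized input.
    """
    total = 0.0
    for _ in range(width):
        h = rng.gauss(0.0, 1.0)  # hidden pre-activation
        w = rng.gauss(0.0, 1.0)  # output-layer weight
        total += w * relu(h)
    return total / width ** 0.5


def excess_kurtosis(values):
    """Sample excess kurtosis m4 / m2^2 - 3 (zero for a Gaussian)."""
    count = len(values)
    mean = sum(values) / count
    m2 = sum((v - mean) ** 2 for v in values) / count
    m4 = sum((v - mean) ** 4 for v in values) / count
    return m4 / (m2 * m2) - 3.0


def estimate_excess_kurtosis(width, samples=20000, seed=0):
    """Monte Carlo estimate of the output's excess kurtosis at a given width."""
    rng = random.Random(seed)
    draws = [sample_output(width, rng) for _ in range(samples)]
    return excess_kurtosis(draws)


def exact_relu_excess_kurtosis(width):
    """Closed form for this toy model: 3 * (E[s^4] - E[s^2]^2) / (n * E[s^2]^2)
    with s = relu(h), E[s^2] = 1/2 and E[s^4] = 3/2, which equals 15 / n."""
    return 15.0 / width


if __name__ == "__main__":
    for n in (2, 8, 32):
        estimate = estimate_excess_kurtosis(n)
        exact = exact_relu_excess_kurtosis(n)
        print(f"n={n:3d}  simulated={estimate:.3f}  exact={exact:.3f}")
```

In this toy setting the departure from Gaussianity decays like $n^{-1}$ in the fourth cumulant; the exponents $\gamma$ in the abstract refer to genuine probability metrics and to networks of arbitrary fixed depth, so the sketch should be read only as a sanity check on the qualitative picture.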