We study the finite-dimensional distributions (FDDs) of deep neural networks with randomly initialized weights having finite-order moments. Specifically, we establish Gaussian approximation bounds in the Wasserstein-$1$ distance between the FDDs and their Gaussian limit, assuming a Lipschitz activation function and allowing the layer widths to grow to infinity at arbitrary relative rates. In the special case where all widths are proportional to a common scale parameter $n$ and there are $L-1$ hidden layers, we obtain convergence rates of order $n^{-(1/6)^{L-1} + \varepsilon}$, for any $\varepsilon > 0$.