While suitably scaled convolutional neural networks (CNNs) with Gaussian initialization are known to converge to Gaussian processes as the number of channels diverges, little is known beyond this Gaussian limit. We establish a large deviation principle (LDP) for CNNs in the infinite-channel regime. We consider a broad class of multidimensional CNN architectures whose general receptive fields are encoded through a patch-extractor function satisfying mild structural assumptions. Our main result is an LDP for the sequence of conditional covariance matrices under a Gaussian prior distribution on the weights. We further derive an LDP for the posterior distribution obtained by conditioning on a finite number of observations. In addition, we provide a streamlined proof of the concentration of the conditional covariances and of the Gaussian equivalence of the network. To the best of our knowledge, this is the first large deviation principle established for CNNs.
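For reference, the LDP terminology here is the standard one; the following is a minimal reminder of the textbook definition (as in Dembo and Zeitouni), not notation taken from the paper. A sequence of random variables $(X_n)_{n \ge 1}$ taking values in a topological space $\mathcal{X}$ satisfies an LDP with speed $n$ and rate function $I : \mathcal{X} \to [0, \infty]$ if, for every Borel set $\Gamma \subseteq \mathcal{X}$,
\[
-\inf_{x \in \Gamma^{\circ}} I(x)
\;\le\; \liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}(X_n \in \Gamma)
\;\le\; \limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}(X_n \in \Gamma)
\;\le\; -\inf_{x \in \overline{\Gamma}} I(x),
\]
where $\Gamma^{\circ}$ and $\overline{\Gamma}$ denote the interior and closure of $\Gamma$. In the setting of the abstract, one would read $n$ as the number of channels and $X_n$ as the conditional covariance matrix of the network outputs; this identification is our reading of the abstract rather than a statement from the paper.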