While suitably scaled convolutional neural networks (CNNs) with Gaussian initialization are known to converge to Gaussian processes as the number of channels diverges, little is known beyond this Gaussian limit. We establish a large deviation principle (LDP) for CNNs in the infinite-channel regime. We consider a broad class of multidimensional CNN architectures with general receptive fields, encoded through a patch-extractor function satisfying mild structural assumptions. Our main result is an LDP for the sequence of conditional covariance matrices under a Gaussian prior distribution on the weights. We further derive an LDP for the posterior distribution obtained by conditioning on a finite number of observations. In addition, we provide a streamlined proof of the concentration of the conditional covariances and of the Gaussian equivalence of the network. To the best of our knowledge, this is the first LDP established for CNNs.