Multilayer neural networks set the current state of the art for many technical classification problems. However, these networks remain essentially black boxes when it comes to analyzing them and predicting their performance. Here, we develop a statistical theory for the one-layer perceptron and show that it can predict the performance of a surprisingly large variety of neural networks with different architectures. A general theory of classification with perceptrons is developed by generalizing an existing theory for analyzing reservoir computing models and connectionist models for symbolic reasoning known as vector symbolic architectures. Our statistical theory offers three formulas that leverage the signal statistics in increasing detail. The formulas are analytically intractable but can be evaluated numerically. The description level that captures the most detail requires stochastic sampling methods. Depending on the network model, the simpler formulas already yield high prediction accuracy. The quality of the theory's predictions is assessed in three experimental settings: a memorization task for echo state networks (ESNs) from the reservoir computing literature, a collection of classification datasets for shallow randomly connected networks, and the ImageNet dataset for deep convolutional neural networks. We find that the second description level of the perceptron theory can predict the performance of types of ESNs that could not be described previously. The theory can also predict the performance of deep multilayer neural networks when applied to their output layer. While other methods for predicting neural network performance commonly require training an estimator model, the proposed theory requires only the first two moments of the distribution of the postsynaptic sums in the output neurons. The perceptron theory compares favorably to other methods that do not rely on training an estimator model.
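To illustrate how a prediction from only the first two moments of the postsynaptic sums might be evaluated numerically, the following is a minimal sketch and not the formulas of the theory itself. It assumes that the sum in the correct output neuron and the sums in the remaining output neurons are independent Gaussians, and that the incorrect neurons share one mean and one standard deviation. Under that assumption, the accuracy is the probability that the correct neuron has the largest sum, which is a one-dimensional integral. All function and parameter names (`predicted_accuracy`, `mu_hit`, `sigma_rej`, and so on) are hypothetical. The sampling routine is only a sanity check of the integral and is not the stochastic sampling method of the most detailed description level.

```python
import math
import random


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predicted_accuracy(mu_hit, sigma_hit, mu_rej, sigma_rej, num_classes,
                       steps=4001, width=8.0):
    """Probability that the correct output neuron has the largest sum.

    Assumption (not taken from the source): all postsynaptic sums are
    independent Gaussians; the correct neuron has moments
    (mu_hit, sigma_hit) and each incorrect neuron (mu_rej, sigma_rej).
    The integral is evaluated with the trapezoidal rule.
    """
    lo = mu_hit - width * sigma_hit
    hi = mu_hit + width * sigma_hit
    dh = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        h = lo + i * dh
        # Gaussian density of the correct neuron's sum at h.
        density = (math.exp(-0.5 * ((h - mu_hit) / sigma_hit) ** 2)
                   / (sigma_hit * math.sqrt(2.0 * math.pi)))
        # Probability that every incorrect neuron stays below h.
        all_lower = normal_cdf((h - mu_rej) / sigma_rej) ** (num_classes - 1)
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * density * all_lower * dh
    return total


def sampled_accuracy(mu_hit, sigma_hit, mu_rej, sigma_rej, num_classes,
                     trials=20000, seed=0):
    """Monte Carlo check of the same quantity under the same assumptions."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        hit = rng.gauss(mu_hit, sigma_hit)
        best_rej = max(rng.gauss(mu_rej, sigma_rej)
                       for _ in range(num_classes - 1))
        if hit > best_rej:
            correct += 1
    return correct / trials


if __name__ == "__main__":
    # Illustrative moments only; these values are not from any experiment.
    print(predicted_accuracy(2.0, 1.0, 0.0, 1.0, num_classes=10))
    print(sampled_accuracy(2.0, 1.0, 0.0, 1.0, num_classes=10))
```

In practice the four moments would be estimated from the postsynaptic sums of a network's output layer on labeled data. When the correct and incorrect neurons have identical moments, the sketch returns chance level, 1 / `num_classes`, as expected by symmetry.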