A theory of data variability in Neural Network Bayesian inference

Bayesian inference and kernel methods are well established in machine learning. The neural network Gaussian process in particular provides a concept to investigate neural networks in the limit of infinitely wide hidden layers by using kernel and inference methods. Here we build upon this limit and provide a field-theoretic formalism which covers the generalization properties of infinitely wide networks. We systematically compute generalization properties of linear, non-linear, and deep non-linear networks for kernel matrices with heterogeneous entries. In contrast to currently employed spectral methods, we derive the generalization properties from the statistical properties of the input, elucidating the interplay of input dimensionality, size of the training data set, and variability of the data. We show that data variability leads to a non-Gaussian action reminiscent of a ($\varphi^3+\varphi^4$)-theory. Using our formalism on a synthetic task and on MNIST, we obtain a homogeneous kernel matrix approximation for the learning curve, together with corrections due to data variability that allow the generalization properties to be estimated. We also obtain exact results for the bounds of the learning curves in the limit of infinitely many training data points.
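As a rough illustration of the setting the abstract describes, the sketch below runs Gaussian process inference with the kernel of a single infinitely wide linear layer on a synthetic two-class task in which the inputs scatter around class centers. It is a minimal sketch under stated assumptions, not the field-theoretic computation of the paper: the function names, the choice of a linear kernel, the noise level, and the task parameters (`d`, `n`, `spread`) are all hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_nngp_kernel(xa, xb, sigma_w2=1.0, sigma_b2=0.0):
    """Kernel of one linear layer at infinite width: sigma_w2 * <x, x'> / d + sigma_b2."""
    d = xa.shape[1]
    return sigma_w2 * (xa @ xb.T) / d + sigma_b2


def gp_posterior_mean(k_train, k_test_train, y_train, noise=0.1):
    """Posterior mean of GP regression: K_* (K + noise * I)^{-1} y."""
    n = k_train.shape[0]
    alpha = np.linalg.solve(k_train + noise * np.eye(n), y_train)
    return k_test_train @ alpha


def sample_task(rng, n, d, spread=1.0):
    """Hypothetical synthetic task: inputs scatter around +center or -center."""
    labels = rng.choice([-1.0, 1.0], size=n)
    center = np.full(d, 0.5)
    x = labels[:, None] * center + spread * rng.standard_normal((n, d))
    return x, labels


rng = np.random.default_rng(0)
x_train, y_train = sample_task(rng, n=100, d=20)
x_test, y_test = sample_task(rng, n=200, d=20)

# The scatter of the inputs makes the entries of the kernel matrix heterogeneous.
k_train = linear_nngp_kernel(x_train, x_train)
k_test_train = linear_nngp_kernel(x_test, x_train)

mean_test = gp_posterior_mean(k_train, k_test_train, y_train)
test_error = float(np.mean((mean_test - y_test) ** 2))
accuracy = float(np.mean(np.sign(mean_test) == y_test))
print(f"test MSE: {test_error:.3f}, sign accuracy: {accuracy:.3f}")
```

Repeating this for increasing `n` and averaging over random draws would give an empirical learning curve; changing `spread` varies the data variability, and with it the heterogeneity of the kernel matrix entries.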

