We introduce a probability distribution, combined with an efficient sampling algorithm, for the weights and biases of fully-connected neural networks. In a supervised learning context, no iterative optimization or gradient computation for internal network parameters is needed to obtain a trained network. The sampling is based on the idea of random feature models. However, instead of a data-agnostic distribution, e.g., a normal distribution, we use both the input and the output training data to sample shallow and deep networks. We prove that sampled networks are universal approximators. For Barron functions, we show that the $L^2$-approximation error of sampled shallow networks decreases in proportion to the inverse square root of the number of neurons. Our sampling scheme is invariant to rigid body transformations and scaling of the input data, which implies that many popular pre-processing techniques are not required. In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures.
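To make the idea of data-driven sampling concrete, the following is a minimal sketch of one way a shallow network could be built without gradient-based training. It is an illustration under stated assumptions, not the construction from the paper: the abstract does not specify the distribution, so the pair-based weight construction, the sampling density (favouring pairs of training points whose outputs differ strongly relative to their input distance), the constants `S1` and `S2`, and the function names `sample_hidden_layer` and `fit_sampled_network` are all hypothetical choices. Only the overall pattern is taken from the text: hidden parameters are sampled using both inputs and outputs, and no iterative optimization touches them. The output layer is fitted here by linear least squares, which is also an assumption.

```python
import numpy as np

# Assumed constants: with these values a tanh neuron built from the pair
# (x1, x2) outputs -0.5 at x1 and +0.5 at x2.
S1, S2 = np.log(3.0), np.log(3.0) / 2.0


def sample_hidden_layer(X, Y, n_neurons, rng, eps=1e-12):
    """Sample hidden weights and biases from pairs of training points.

    Hypothetical scheme: each neuron is defined by a pair (x_i, x_j), and
    pairs are drawn with probability proportional to
    ||y_j - y_i|| / ||x_j - x_i||. Requires at least two training points.
    """
    n = X.shape[0]
    n_cand = 10 * n_neurons
    i = rng.integers(0, n, size=n_cand)
    j = (i + rng.integers(1, n, size=n_cand)) % n  # guarantees j != i
    dx = X[j] - X[i]
    dist_x = np.linalg.norm(dx, axis=1)
    dist_y = np.linalg.norm(Y[j] - Y[i], axis=1)
    p = dist_y / (dist_x + eps)
    total = p.sum()
    # Fall back to a uniform choice if the outputs are constant.
    p = p / total if total > 0 else np.full(n_cand, 1.0 / n_cand)
    k = rng.choice(n_cand, size=n_neurons, p=p)
    # The weight points from x_i towards x_j; dividing by the squared
    # distance makes the pre-activation depend only on relative positions.
    W = S1 * dx[k] / (dist_x[k, None] ** 2 + eps)  # shape (n_neurons, d)
    b = -np.sum(W * X[i[k]], axis=1) - S2          # shape (n_neurons,)
    return W, b


def fit_sampled_network(X, y, n_neurons, seed=0):
    """Sample the hidden layer, then solve for the output layer."""
    rng = np.random.default_rng(seed)
    Y = y.reshape(len(X), -1)
    W, b = sample_hidden_layer(X, Y, n_neurons, rng)
    H = np.tanh(X @ W.T + b)
    H1 = np.hstack([H, np.ones((len(X), 1))])      # append an output bias
    coef, *_ = np.linalg.lstsq(H1, Y, rcond=None)  # linear solve, no gradients

    def predict(X_new):
        H_new = np.tanh(X_new @ W.T + b)
        return np.hstack([H_new, np.ones((len(X_new), 1))]) @ coef

    return predict


# Usage: fit sin(x) on [-pi, pi] with 50 sampled neurons.
X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(X[:, 0])
predict = fit_sampled_network(X, y, n_neurons=50)
rmse = np.sqrt(np.mean((predict(X)[:, 0] - y) ** 2))
print(f"training RMSE: {rmse:.2e}")
```

In this sketch each pre-activation equals $s_1 \langle x_j - x_i, x - x_i \rangle / \|x_j - x_i\|^2 - s_2$, which is unchanged (up to the small `eps` regularizer) when all inputs are rotated, translated, or uniformly rescaled. This is one simple way a data-dependent scheme can have the kind of invariance the abstract describes.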