Sampling weights of deep neural networks

We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks. In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed to obtain a trained network. The sampling is based on the idea of random feature models. However, instead of a data-agnostic distribution, e.g., a normal distribution, we use both the input and the output training data of the supervised learning problem to sample both shallow and deep networks. We prove that the sampled networks we construct are universal approximators. We also show that our sampling scheme is invariant to rigid body transformations and scaling of the input data. This implies that many popular pre-processing techniques are no longer required. For Barron functions, we show that the $L^2$-approximation error of sampled shallow networks decreases at a rate proportional to the inverse square root of the number of neurons. In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to that of iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures.
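The abstract does not state the sampling rule itself. As a rough illustration of what data-driven sampling without gradient steps can look like, the NumPy sketch below builds each hidden tanh neuron from a pair of training points and then fits only the linear output layer by least squares. The pair-based construction, the sampling density, the constants `S1` and `S2`, the candidate-pool size, and all function names are assumptions made for this example, not details taken from the text above.

```python
import numpy as np

# Assumed constants: chosen so that a neuron built from the pair (x1, x2)
# outputs tanh(-S2) = -1/2 at x1 and tanh(S1 - S2) = +1/2 at x2.
S2 = np.log(3.0) / 2.0
S1 = 2.0 * S2


def sample_hidden_layer(X, Y, n_neurons, rng, n_candidates=None):
    """Sample weights and biases of one tanh layer from pairs of data points.

    Assumption (not from the abstract): a pair (x1, x2) is picked with
    probability proportional to ||y2 - y1|| / ||x2 - x1||, so regions where
    the target changes quickly receive more neurons.
    Requires at least two training points.
    """
    n = X.shape[0]
    m = n_candidates or 10 * n_neurons
    i = rng.integers(0, n, size=m)
    j = (i + rng.integers(1, n, size=m)) % n  # offset in [1, n-1], so j != i
    diff = X[j] - X[i]
    dx = np.linalg.norm(diff, axis=1)
    dy = np.linalg.norm(Y[j] - Y[i], axis=1)
    valid = dx > 0  # pairs of identical inputs cannot define a direction
    density = np.where(valid, dy / np.where(valid, dx, 1.0), 0.0)
    if density.sum() == 0.0:  # constant target: fall back to uniform sampling
        density = valid.astype(float)
    pick = rng.choice(m, size=n_neurons, p=density / density.sum())
    W = S1 * diff[pick] / dx[pick, None] ** 2  # shape (n_neurons, d)
    b = -np.sum(W * X[i[pick]], axis=1) - S2   # shape (n_neurons,)
    return W, b


def features(X, W, b):
    """Hidden-layer activations with an appended constant column."""
    H = np.tanh(X @ W.T + b)
    return np.hstack([H, np.ones((X.shape[0], 1))])


def fit_sampled_network(X, Y, n_neurons, seed=0):
    """Sample the hidden layer, then solve for the output layer only."""
    rng = np.random.default_rng(seed)
    W, b = sample_hidden_layer(X, Y, n_neurons, rng)
    coef, *_ = np.linalg.lstsq(features(X, W, b), Y, rcond=None)
    return W, b, coef


def predict(X, W, b, coef):
    return features(X, W, b) @ coef


# Toy usage: approximate sin(x) on [-pi, pi] with 50 sampled neurons.
data_rng = np.random.default_rng(1)
X_train = data_rng.uniform(-np.pi, np.pi, size=(400, 1))
Y_train = np.sin(X_train)
model = fit_sampled_network(X_train, Y_train, n_neurons=50)

X_test = np.linspace(-np.pi, np.pi, 200)[:, None]
rmse = np.sqrt(np.mean((predict(X_test, *model) - np.sin(X_test)) ** 2))
print(f"test RMSE: {rmse:.2e}")
```

In this sketch the only fitted quantity is `coef`, obtained from a single linear least-squares solve; the hidden weights come directly from differences between data points, so no gradient with respect to them is ever computed.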

