Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks

Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable function space for modeling neural networks, because the curse of dimensionality (CoD) cannot be evaded even when approximating a single ReLU neuron (Bach, 2017). In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms (e.g., the path norm, the Barron norm) from the perspective of sample complexity and generalization properties. First, we show that the path norm (as well as the Barron norm) yields width-independent sample complexity bounds, which allows for uniform convergence guarantees. Based on this result, we derive an improved metric entropy bound for $\epsilon$-covering of order $\mathcal{O}(\epsilon^{-\frac{2d}{d+2}})$ (where $d$ is the input dimension and the hidden constant depends at most polynomially on $d$) via the convex hull technique. This demonstrates a separation from kernel methods, which require $\Omega(\epsilon^{-d})$ to learn a target function in a Barron space. Second, this metric entropy result allows us to build a sharper generalization bound under a general moment hypothesis setting, achieving the rate $\mathcal{O}(n^{-\frac{d+2}{2d+2}})$. Our analysis is novel in that it offers a sharper, refined estimate of the metric entropy (with a clear dependence on the dimension $d$) and handles unbounded sampling in the estimation of the sample error and the output error.
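To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' code. It assumes the common mean-field parameterization $f(x) = \frac{1}{m}\sum_{k=1}^{m} a_k \max(\langle w_k, x\rangle, 0)$ with path norm $\frac{1}{m}\sum_{k=1}^{m} |a_k| \, \|w_k\|_1$; the abstract itself does not state the definition, so this scaling is an assumption. The function names `path_norm`, `entropy_exponents`, and `rate_exponent` are hypothetical helpers introduced only for illustration, and the exponents they return are simply those quoted in the abstract.

```python
def path_norm(a, W):
    """Path norm of a two-layer ReLU network (assumed definition).

    Assumes f(x) = (1/m) * sum_k a[k] * relu(<W[k], x>) and
    path norm = (1/m) * sum_k |a[k]| * ||W[k]||_1.
    `a` is a list of m output weights; `W` is a list of m weight vectors.
    """
    m = len(a)
    if m == 0 or len(W) != m:
        raise ValueError("a and W must be non-empty and have the same length")
    total = sum(abs(a_k) * sum(abs(w) for w in w_k) for a_k, w_k in zip(a, W))
    return total / m


def entropy_exponents(d):
    """Exponents of 1/epsilon quoted in the abstract for input dimension d.

    Returns (2d/(d+2), d): the metric entropy exponent for the
    norm-constrained networks and the kernel-method exponent.
    """
    return 2 * d / (d + 2), float(d)


def rate_exponent(d):
    """Exponent (d+2)/(2d+2) of 1/n in the generalization rate quoted above."""
    return (d + 2) / (2 * d + 2)


# A width-2 network and the same network with every neuron duplicated (width 4).
a = [1.0, -2.0]
W = [[1.0, 0.0], [0.5, -0.5]]
narrow = path_norm(a, W)
wide = path_norm(a * 2, W * 2)

for d in (2, 10, 100):
    ours, kernel = entropy_exponents(d)
    print(f"d={d}: entropy exponent {ours:.3f} vs kernel {kernel:.0f}, "
          f"rate exponent {rate_exponent(d):.3f}")
```

Under the assumed $1/m$ scaling, duplicating every neuron leaves both the function and the path norm unchanged, which gives some intuition for why a bound in terms of this norm need not depend on the width. The loop also shows that the exponent $\frac{2d}{d+2}$ stays below 2 for every $d$, whereas the kernel exponent grows linearly in $d$, and that the rate exponent $\frac{d+2}{2d+2}$ stays above $\frac{1}{2}$.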

