Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces

Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb{R}^d$. We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(\Omega))$ and Besov spaces $B^s_r(L_q(\Omega))$, with error measured in the $L_p(\Omega)$ norm. This problem is important when studying the application of neural networks in a variety of fields, including scientific computing and signal processing, and has previously been solved only when $p=q=\infty$. Our contribution is to provide a complete solution for all $1\leq p,q\leq \infty$ and $s > 0$ for which the corresponding Sobolev or Besov space compactly embeds into $L_p$. The key technical tool is a novel bit-extraction technique which gives an optimal encoding of sparse vectors. This enables us to obtain sharp upper bounds in the non-linear regime where $p > q$. We also provide a novel method for deriving $L_p$-approximation lower bounds based upon VC-dimension when $p < \infty$. Our results show that very deep ReLU networks significantly outperform classical methods of approximation in terms of the number of parameters, but that this comes at the cost of parameters which are not encodable.
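The parameter range described above can be checked numerically. The sketch below is not taken from the paper: it assumes the standard Sobolev embedding condition, under which $W^s(L_q(\Omega))$ (and likewise $B^s_r(L_q(\Omega))$) embeds compactly into $L_p(\Omega)$ when $1/q - 1/p < s/d$. The function names `compactly_embeds` and `regime` are hypothetical, and the label "linear" for $p \leq q$ is only our name for the complement of the non-linear regime $p > q$ mentioned in the abstract.

```python
import math


def compactly_embeds(s: float, d: int, p: float, q: float) -> bool:
    """Check the (assumed) standard compact embedding condition 1/q - 1/p < s/d.

    s: smoothness index (s > 0), d: dimension of the cube [0,1]^d,
    q: integrability of the function space, p: norm in which error is measured.
    Use math.inf for p or q equal to infinity.
    """
    if s <= 0 or d < 1 or p < 1 or q < 1:
        raise ValueError("need s > 0, d >= 1 and 1 <= p, q <= infinity")
    # 1.0 / math.inf evaluates to 0.0, so infinite exponents need no special case.
    return 1.0 / q - 1.0 / p < s / d


def regime(p: float, q: float) -> str:
    """Label the regime: the abstract calls p > q the non-linear regime."""
    return "non-linear" if p > q else "linear"


if __name__ == "__main__":
    for s, d, p, q in [(1, 2, math.inf, 1), (3, 2, math.inf, 1), (0.1, 5, 2, 2)]:
        print(f"s={s}, d={d}, p={p}, q={q}: "
              f"compact={compactly_embeds(s, d, p, q)}, regime={regime(p, q)}")
```

When $p \leq q$ the condition holds for every $s > 0$, so the constraint only bites in the non-linear regime, where more smoothness is needed as the gap between $1/q$ and $1/p$ grows.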

