The paper provides approximation guarantees for neural networks trained with gradient flow, with error measured in the continuous $L_2(\mathbb{S}^{d-1})$-norm on the unit sphere in $\mathbb{R}^d$ and with Sobolev smooth targets. The networks are fully connected, of constant depth and increasing width. Although all layers are trained, the gradient flow convergence is based on a neural tangent kernel (NTK) argument for the non-convex second-to-last layer. Unlike standard NTK analysis, the continuous error norm implies an under-parametrized regime, made possible by the natural smoothness assumption required for approximation. The typical over-parametrization re-enters the results in the form of a loss in approximation rate relative to established approximation methods for Sobolev smooth functions.