We study the structural and statistical properties of $\mathcal{R}$-norm minimizing interpolants of datasets labeled by specific target functions. The $\mathcal{R}$-norm is the basis of an inductive bias for two-layer neural networks, recently introduced to capture the functional effect of controlling the size of network weights, independently of the network width. We find that these interpolants are intrinsically multivariate functions, even when there are ridge functions that fit the data, and also that the $\mathcal{R}$-norm inductive bias is not sufficient for achieving statistically optimal generalization for certain learning problems. Altogether, these results shed new light on an inductive bias that is connected to practical neural network training.