We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix) and, more generally, neural networks. Focusing on size-independent bounds, where only the Frobenius norm distance of the parameters from some fixed reference matrix $W_0$ is controlled, we show that the sample complexity behavior can be surprisingly different from what one might expect based on the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.