We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix) and, more generally, neural networks. Focusing on size-independent bounds, where only the Frobenius norm distance of the parameters from some fixed reference matrix $W_0$ is controlled, we show that the sample complexity behavior can be surprisingly different from what one might expect based on the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.