We study the sample complexity of learning ReLU neural networks from the point of view of generalization. Given norm constraints on the weight matrices, a common approach is to estimate the Rademacher complexity of the associated function class. Previously, Golowich, Rakhlin, and Shamir (2020) obtained a bound that is independent of the network size (it scales with a product of Frobenius norms) except for a factor of the square root of the depth. We give a refinement which often has no explicit depth dependence at all.
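For orientation, the quantities referred to above can be written out schematically. This is only a sketch under assumed notation that the text itself does not fix: $m$ samples $x_1,\dots,x_m$ with $\|x_i\| \le B$, a depth-$d$ ReLU network class $\mathcal{H}_d$ whose weight matrices satisfy $\|W_j\|_F \le M_F(j)$, i.i.d. uniform random signs $\epsilon_i \in \{\pm 1\}$, and an unspecified absolute constant $C$. The second display records only the general shape of a size-independent bound with a square-root depth factor, not its exact constants.

```latex
% Empirical Rademacher complexity of a function class H on x_1, ..., x_m
\hat{\mathcal{R}}_m(\mathcal{H})
  = \mathbb{E}_{\epsilon}\left[\, \sup_{h \in \mathcal{H}}
      \frac{1}{m} \sum_{i=1}^{m} \epsilon_i \, h(x_i) \right]

% Schematic shape of a size-independent bound (assumed notation, constant C unspecified):
% product of Frobenius-norm bounds, times sqrt(depth), over sqrt(sample size)
\hat{\mathcal{R}}_m(\mathcal{H}_d)
  \le \frac{C \, B \, \sqrt{d} \, \prod_{j=1}^{d} M_F(j)}{\sqrt{m}}
```

In this notation, a bound with no explicit depth dependence would be one in which the $\sqrt{d}$ factor does not appear, although the product of norm bounds still depends on the depth implicitly.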