Since statistical guarantees for neural networks are usually restricted to global optima of intricate objective functions, it is unclear whether these theories explain the performance of the actual outputs of neural network pipelines. The goal of this paper is, therefore, to bring statistical theory closer to practice. We develop statistical guarantees for shallow linear neural networks that coincide, up to logarithmic factors, with those for global optima but apply to stationary points and points nearby. These results support, from a mathematical perspective, the common notion that neural networks do not necessarily need to be optimized globally. We then extend our statistical guarantees to shallow ReLU neural networks under the assumption that the first-layer weight matrices of the stationary network and the target are nearly identical. More generally, despite being limited to shallow neural networks for now, our theories take an important step toward describing the practical properties of neural networks in mathematical terms.