In this article, we show the existence of minimizers in the loss landscape for residual artificial neural networks (ANNs) with a multi-dimensional input layer and one hidden layer with ReLU activation. Our work contrasts with earlier results in [D. Gallon, A. Jentzen, and F. Lindner, preprint, arXiv:2211.15641, 2022] and [P. Petersen, M. Raslan, and F. Voigtlaender, Found. Comput. Math., 21 (2021), pp. 375-444], which showed that, in many situations, minimizers do not exist for common smooth activation functions, even when the target functions are polynomials. The proof of the existence property makes use of the closure of the search space, which contains all functions generated by ANNs together with additional discontinuous generalized responses. As we show, these additional generalized responses in the larger space are suboptimal, so that the minimum is attained in the original function class.
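To fix ideas, the following display sketches one common parametrization of such a shallow residual ReLU network and the associated squared-error loss; the input dimension $d$, hidden width $h$, parameter vector $\theta$, target function $f$, and measure $\mu$ are illustrative notation of ours and need not match the paper's conventions.

\[
\mathcal{N}_\theta(x) \;=\; c \;+\; \langle a, x \rangle \;+\; \sum_{i=1}^{h} v_i \,\max\{\langle w_i, x\rangle + b_i,\, 0\},
\qquad x \in \mathbb{R}^d,
\]
\[
\mathfrak{L}(\theta) \;=\; \int_{\mathbb{R}^d} \bigl(\mathcal{N}_\theta(x) - f(x)\bigr)^2 \,\mu(\mathrm{d}x),
\]

Here the affine term $\langle a, x\rangle + c$ plays the role of the residual (skip) connection, and $\theta = \bigl(a, c, (w_i, b_i, v_i)_{i=1}^{h}\bigr)$ collects all trainable parameters; the existence question concerns whether $\mathfrak{L}$ attains its infimum over all such $\theta$.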