We prove the existence of global minima in the loss landscape for approximating continuous target functions with shallow feedforward artificial neural networks with ReLU activation. This property is one of the fundamental features separating ReLU from other commonly used activation functions. We propose a kind of closure of the search space such that minimizers exist in the extended space. In a second step, we show under mild assumptions that the functions newly added in the extension perform worse than suitable representable ReLU networks. This implies that the optimal response in the extended space is indeed the response of a ReLU network.
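For orientation, the setting can be sketched as follows. The notation is an assumption made for illustration and is not taken from the text above: width $n$, input dimension $d$, parameters $\theta = (c, a_j, w_j, b_j)_{j=1}^{n}$, a data measure $\mu$, and a continuous target $f$. The squared loss is likewise only one plausible choice; the result may cover a different or more general class of losses.

```latex
% Assumed notation (hypothetical, for illustration only):
% shallow ReLU network response with n hidden neurons on inputs x in R^d
\[
  \mathcal{N}_\theta(x) = c + \sum_{j=1}^{n} a_j \max\{\langle w_j, x\rangle + b_j,\, 0\},
  \qquad
  \mathcal{L}(\theta) = \int \bigl(\mathcal{N}_\theta(x) - f(x)\bigr)^2 \,\mu(\mathrm{d}x).
\]
% Existence of a global minimum means that the infimum is attained:
\[
  \exists\, \theta^\ast : \quad \mathcal{L}(\theta^\ast) = \inf_{\theta} \mathcal{L}(\theta).
\]
```

In this reading, the closure step enlarges the set of responses $\{\mathcal{N}_\theta\}$ so that the infimum is attained there, and the second step argues that the minimizer can be taken to be some $\mathcal{N}_{\theta^\ast}$ from the original set.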