Deep neural networks have achieved remarkable success in diverse applications, prompting the need for a solid theoretical foundation. Recent research has identified the minimal width $\max\{2,d_x,d_y\}$ required for neural networks with input dimension $d_x$ and output dimension $d_y$ that use leaky ReLU activations to universally approximate $L^p(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ on compacta. Here, we present an alternative proof of this minimal width by directly constructing approximating networks through a coding scheme that leverages the properties of leaky ReLUs together with standard $L^p$ results. The resulting construction has a minimal interior dimension of $1$, independent of the input and output dimensions, which allows us to show that autoencoders with leaky ReLU activations are universal approximators of $L^p$ functions. Furthermore, we demonstrate that the normalizing flow LU-Net serves as a distributional universal approximator. We broaden our results to show that smooth invertible neural networks can approximate $L^p(\mathbb{R}^{d},\mathbb{R}^{d})$ on compacta when the dimension $d\geq 2$, which provides a constructive proof of a classical theorem of Brenier and Gangbo. In addition, we use a topological argument to establish that, for FNNs with monotone Lipschitz continuous activations, $d_x+1$ is a lower bound on the minimal width required for the uniform universal approximation of continuous functions $C^0(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ on compacta when $d_x\geq d_y$.
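The width bound discussed above can be made concrete with a small sketch. The snippet below computes the minimal width $\max\{2,d_x,d_y\}$ and assembles a randomly initialized leaky-ReLU network whose hidden layers all have exactly that width; it illustrates the narrow architecture shape only, not the paper's explicit coding-scheme construction (the `narrow_network` helper and the slope $\alpha=0.1$ are illustrative assumptions).

```python
import numpy as np

def minimal_width(d_x, d_y):
    """Minimal width max{2, d_x, d_y} for leaky-ReLU L^p universality."""
    return max(2, d_x, d_y)

def leaky_relu(x, alpha=0.1):
    # Leaky ReLU: identity for x > 0, slope alpha otherwise (alpha is illustrative)
    return np.where(x > 0, x, alpha * x)

def narrow_network(d_x, d_y, depth, rng):
    """Random leaky-ReLU network whose hidden layers all have the minimal width.

    This only demonstrates the architecture; the weights are random, not the
    approximating construction from the paper.
    """
    w = minimal_width(d_x, d_y)
    dims = [d_x] + [w] * depth + [d_y]
    layers = [(rng.standard_normal((m, n)), rng.standard_normal(n))
              for m, n in zip(dims[:-1], dims[1:])]

    def forward(x):
        for i, (W, b) in enumerate(layers):
            x = x @ W + b
            if i < len(layers) - 1:  # no activation on the output layer
                x = leaky_relu(x)
        return x

    return forward

rng = np.random.default_rng(0)
f = narrow_network(d_x=3, d_y=2, depth=4, rng=rng)
y = f(rng.standard_normal((10, 3)))
print(minimal_width(3, 2), y.shape)  # width 3, outputs of shape (10, 2)
```

Note that for scalar maps ($d_x = d_y = 1$) the formula still gives width $2$, reflecting that width-$1$ leaky-ReLU networks are monotone and hence not universal.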