A Unified and Constructive Framework for the Universality of Neural Networks

One reason many neural networks can replicate complicated tasks or functions is their universal approximation property. Although the past few decades have seen tremendous advances in the theory of neural networks, a single constructive framework for neural network universality remains unavailable. This paper is the first effort to provide a unified and constructive framework for the universality of a large class of activation functions, including most existing ones. At the heart of the framework is the concept of neural network approximate identity (nAI). The main result is: {\em any nAI activation function is universal}. It turns out that most existing activation functions are nAI, and thus universal in the space of continuous functions on compacta. The framework offers {\bf several advantages} over its contemporary counterparts. First, it is constructive, using elementary means from functional analysis, probability theory, and numerical analysis. Second, it is the first unified approach that is valid for most existing activation functions. Third, as a by-product, the framework provides the first universality proof for some existing activation functions, including Mish, SiLU, ELU, and GELU, among others. Fourth, it provides new proofs for most activation functions. Fifth, it discovers new activation functions with a guaranteed universality property. Sixth, for a given activation function and error tolerance, the framework provides precisely the architecture of the corresponding one-hidden-layer neural network, with a predetermined number of neurons and the values of the weights and biases. Seventh, the framework allows us to present, in an abstract setting, the first universal approximation result with a favorable non-asymptotic rate.
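For concreteness, the following minimal Python sketch writes out the standard definitions of four of the activation functions named above, together with the generic form of a one-hidden-layer network in one input variable. It is an illustration only, not the construction of this paper: the function name `one_hidden_layer` and its parameters `weights`, `biases`, and `coeffs` are hypothetical, the ELU parameter `alpha = 1.0` is an assumed default, and the framework's rule for choosing the weights and biases from an error tolerance is not reproduced here.

```python
import math


def silu(x):
    # SiLU (swish): x * sigmoid(x), written to avoid overflow for large |x|.
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return x * ex / (1.0 + ex)


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    # Mish: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def elu(x, alpha=1.0):
    # ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise.
    # alpha = 1.0 is an assumed default, not a value taken from the paper.
    return x if x > 0 else alpha * math.expm1(x)


def gelu(x):
    # GELU in its exact form: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def one_hidden_layer(x, weights, biases, coeffs, activation):
    # Hypothetical helper: evaluates sum_j c_j * activation(w_j * x + b_j)
    # for a scalar input x, i.e. a network with a single hidden layer.
    return sum(c * activation(w * x + b)
               for w, b, c in zip(weights, biases, coeffs))
```

A call such as `one_hidden_layer(0.5, [1.0, 2.0], [0.0, -1.0], [0.3, 0.7], gelu)` evaluates a two-neuron network at a single point; a universality result of the kind stated above concerns how well such sums can approximate a continuous function on a compact set as the number of neurons grows.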
