Universal approximation theorems provide a mathematical explanation for the expressive power of neural networks. They assert that, under mild conditions on the activation function, feedforward neural networks are dense in broad function classes, such as continuous functions on compact subsets of $\mathbb{R}^d$, $L^p$ spaces, or Sobolev spaces. Over the past four decades, these qualitative universality results have evolved into a rich quantitative theory addressing approximation rates, parameter efficiency, and the role of architectural features such as depth and width. This survey presents several glimpses into this theory. We review classical density results for single-hidden-layer networks, as well as quantitative bounds that relate approximation error to network size and smoothness assumptions on target functions. Particular emphasis is placed on depth--width trade-offs and on results demonstrating that deeper architectures can achieve superior parameter efficiency for structured function classes. In addition to standard feedforward neural networks, we also review recent developments on Kolmogorov--Arnold Networks (KANs), which offer an alternative architectural paradigm and whose approximation-theoretic properties have begun to attract significant theoretical attention.