Standard neural network architectures employ fixed activation functions (ReLU, tanh, sigmoid) that are poorly suited for approximating functions with singular or fractional power behavior, a structure that arises ubiquitously in physics, including boundary layers, fracture mechanics, and corner singularities. We introduce Müntz-Szász Networks (MSN), a novel architecture that replaces fixed smooth activations with learnable fractional power bases grounded in classical approximation theory. Each MSN edge computes $\varphi(x) = \sum_k a_k |x|^{\mu_k} + \sum_k b_k \,\mathrm{sign}(x)\,|x|^{\lambda_k}$, where the exponents $\{\mu_k, \lambda_k\}$ are learned alongside the coefficients. We prove that MSN inherits universal approximation from the Müntz-Szász theorem and establish new approximation rates: for functions of the form $|x|^\alpha$, MSN achieves error $\mathcal{O}(|\mu - \alpha|^2)$ with a single learned exponent, whereas standard MLPs require $\mathcal{O}(\varepsilon^{-1/\alpha})$ neurons for comparable accuracy. On supervised regression with singular target functions, MSN achieves 5-8x lower error than MLPs with 10x fewer parameters. Physics-informed neural networks (PINNs) represent a particularly demanding application for singular function approximation; on PINN benchmarks including a singular ODE and stiff boundary-layer problems, MSN achieves a 3-6x improvement while learning interpretable exponents that match the known solution structure. Our results demonstrate that theory-guided architectural design can yield dramatic improvements for scientifically motivated function classes.
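The edge computation above can be sketched in a few lines. This is a minimal illustrative implementation, not the authors' code: the function name `msn_edge` and the array-based parameterization are assumptions, and in practice the coefficients and exponents would be trainable parameters in an autodiff framework.

```python
import numpy as np

def msn_edge(x, a, mu, b, lam):
    """Sketch of one MSN edge activation (hypothetical API):
    phi(x) = sum_k a_k |x|^{mu_k} + sum_k b_k sign(x) |x|^{lam_k}.
    a, mu, b, lam are 1-D arrays over the basis index k."""
    ax = np.abs(x)[..., None]                    # shape (..., 1), broadcast over k
    even = (a * ax ** mu).sum(axis=-1)           # even-symmetric fractional terms
    odd = (b * np.sign(x)[..., None] * ax ** lam).sum(axis=-1)  # odd terms
    return even + odd

# A single basis term whose learned exponent equals the target alpha
# represents |x|^alpha exactly, consistent with the O(|mu - alpha|^2) rate:
x = np.linspace(-1.0, 1.0, 5)
phi = msn_edge(x, a=np.array([1.0]), mu=np.array([0.5]),
               b=np.array([0.0]), lam=np.array([1.0]))
# phi coincides with |x|^0.5 at every sample point
```

The `|x|^{mu}` / `sign(x)|x|^{lam}` split separates the even and odd parts of the target, so a symmetric singularity like `|x|^alpha` needs only the first sum.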