The cost of an operator-learning system is not governed solely by total parameter count; for one query, the relevant bottleneck can be the model that must be loaded and evaluated. We study this distinction for classical neural operators on compact Sobolev subsets through a constructive comparison between routed mixtures of neural operators (MoNOs) and a fixed single-neural-operator construction. The comparison concerns expert-active complexity relative to that baseline, with total stored size and routing search accounted for separately. A MoNO routes each input function through a tree to one expert. Our main theorem shows that every scalar uniformly continuous nonlinear operator with bounded output Sobolev radius on the approximation set admits a MoNO approximation whose active expert has smaller depth, width, and rank scaling than the analyzed single-neural-operator construction; for Lipschitz targets these expert quantities are bounded by $\mathcal{O}(\varepsilon^{-1})$. The theorem turns localization into an operator-level accounting of active expert size, routing depth, and number of experts. We also prove a quantitative universal approximation theorem for the underlying neural-operator architecture, with explicit dependence on compact-set diameter and modulus of continuity.
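The three quantities in this accounting (active expert size, routing depth, number of experts) can be illustrated with a toy sketch. Everything in it is a hypothetical stand-in rather than the construction analyzed in the theorem: input functions are sampled on a uniform grid, each tree node routes to the child whose reference sample is nearer in a discrete $L^2$ distance, the target is the scalar functional $u \mapsto (\text{mean } u)^2$, and each expert is a two-parameter tangent line. The names `Node`, `route`, `make_tree`, and `total_params` are introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Node:
    # Internal nodes hold two reference samples; leaves hold one expert.
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    left_ref: Optional[Sequence[float]] = None
    right_ref: Optional[Sequence[float]] = None
    expert: Optional[Callable[[Sequence[float]], float]] = None
    n_params: int = 0  # parameter count of the expert stored at a leaf


def mean(u: Sequence[float]) -> float:
    return sum(u) / len(u)


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    # Discrete squared L2 distance between two sampled functions.
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def make_tree(lo: float, hi: float, depth: int, n: int) -> Node:
    """Binary tree over mean values in [lo, hi] (assumed toy localization)."""
    if depth == 0:
        c = 0.5 * (lo + hi)
        # Hypothetical expert: tangent line of t -> t**2 at the cell centre c.
        return Node(expert=lambda u: c * c + 2.0 * c * (mean(u) - c), n_params=2)
    mid = 0.5 * (lo + hi)
    return Node(
        left=make_tree(lo, mid, depth - 1, n),
        right=make_tree(mid, hi, depth - 1, n),
        left_ref=[0.5 * (lo + mid)] * n,
        right_ref=[0.5 * (mid + hi)] * n,
    )


def route(root: Node, u: Sequence[float]) -> Tuple[Node, int]:
    """Walk the tree to a single leaf; return the leaf and the routing depth."""
    node, depth = root, 0
    while node.expert is None:
        go_left = sq_dist(u, node.left_ref) <= sq_dist(u, node.right_ref)
        node = node.left if go_left else node.right
        depth += 1
    return node, depth


def total_params(node: Node) -> int:
    """Total stored expert parameters, as opposed to the active count."""
    if node.expert is not None:
        return node.n_params
    return total_params(node.left) + total_params(node.right)


if __name__ == "__main__":
    root = make_tree(0.0, 1.0, depth=2, n=8)
    u = [0.3] * 8
    leaf, depth = route(root, u)
    print("active params:", leaf.n_params)
    print("routing depth:", depth)
    print("total stored params:", total_params(root))
    print("prediction:", leaf.expert(u))
```

In this toy setting, deepening the tree shrinks each cell, so the two-parameter expert stays fixed in size while the stored total and the routing depth grow; only the active expert has to be loaded and evaluated for a given query.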