Operator learning based on neural operators has emerged as a promising paradigm for the data-driven approximation of operators mapping between infinite-dimensional Banach spaces. Despite significant empirical progress, our theoretical understanding of the efficiency of these approximations remains incomplete. This work addresses the parametric complexity of neural operator approximations for the general class of Lipschitz continuous operators. Motivated by recent findings on the limitations of specific architectures, termed the curse of parametric complexity, we adopt an information-theoretic perspective. Our main contribution establishes lower bounds on the metric entropy of Lipschitz operators in two approximation settings: uniform approximation over a compact set of input functions, and approximation in expectation, with input functions drawn from a probability measure. These entropy bounds imply that, regardless of the activation function used, neural operator architectures attaining an approximation accuracy $\epsilon$ must have a size that is exponentially large in $\epsilon^{-1}$. The size of an architecture is measured here by the number of encoded bits necessary to store the given model in computational memory. The results of this work elucidate fundamental trade-offs and limitations in operator learning.