Near-optimal learning of Banach-valued, high-dimensional functions via deep neural networks

The past decade has seen increasing interest in applying Deep Learning (DL) to Computational Science and Engineering (CSE). Driven by impressive results in applications such as computer vision, Uncertainty Quantification (UQ), genetics, simulations and image processing, DL is increasingly supplanting classical algorithms and seems poised to revolutionize scientific computing. However, DL is not yet well understood from the standpoint of numerical analysis: little is known about its efficiency and reliability in terms of stability, robustness, accuracy, and sample complexity. In particular, approximating solutions to parametric PDEs is an objective of UQ for CSE. Training data for such problems is often scarce and corrupted by errors. Moreover, the target function is a smooth function of possibly infinitely many variables, taking values in the PDE solution space, which is generally an infinite-dimensional Banach space. This paper argues that Deep Neural Network (DNN) approximation of such functions, with both known and unknown parametric dependence, can overcome the curse of dimensionality. We establish practical existence theorems that describe classes of DNNs with dimension-independent architecture size, together with training procedures based on minimizing the (regularized) $\ell^2$-loss, which achieve near-optimal algebraic rates of convergence. These results involve key extensions of compressed sensing for Banach-valued recovery and polynomial emulation with DNNs. When approximating solutions of parametric PDEs, our results account for all sources of error, i.e., sampling, optimization, approximation and physical discretization, and allow for training high-fidelity DNN approximations from coarse-grained sample data. Our theoretical results fall into the category of non-intrusive methods, providing a theoretical alternative to classical methods for high-dimensional approximation.
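
As a rough illustration of the kind of training problem meant by minimizing the (regularized) $\ell^2$-loss, one might consider the following sketch. All notation here is assumed for illustration and is not taken from the paper: $y_1,\dots,y_m$ are sample points in the parameter domain, $d_i = f(y_i) + n_i$ are noisy samples of the target function $f$ in a Banach space $\mathcal{V}$, $\mathcal{N}$ is a class of DNNs, $\lambda \geq 0$ is a regularization parameter, and $\mathcal{J}$ is an unspecified penalty term.

$$
\min_{\Phi \in \mathcal{N}} \; \frac{1}{m} \sum_{i=1}^{m} \left\| \Phi(y_i) - d_i \right\|_{\mathcal{V}}^{2} + \lambda \, \mathcal{J}(\Phi)
$$

Setting $\lambda = 0$ gives the unregularized $\ell^2$-loss. In this hypothetical notation, an algebraic rate of convergence means that the approximation error decays like $m^{-r}$ for some $r > 0$ as the number of samples $m$ grows.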
