Neural Network Approximation of Continuous Functions in High Dimensions with Applications to Inverse Problems

The remarkable successes of neural networks in a wide variety of inverse problems have fueled their adoption in disciplines ranging from medical imaging to seismic analysis over the past decade. However, these inverse problems are high-dimensional, and current theory predicts that network size should scale exponentially in the dimension of the problem. As a result, the theory cannot explain why the seemingly small networks used in these settings work as well as they do in practice. To reduce this gap between theory and practice, we provide a general method for bounding the complexity required for a neural network to approximate a H\"older (or uniformly) continuous function defined on a high-dimensional set with a low-complexity structure. The approach is based on the observation that the existence of a Johnson-Lindenstrauss (JL) embedding $A\in\mathbb{R}^{d\times D}$ of a given high-dimensional set $S\subset\mathbb{R}^D$ into a low-dimensional cube $[-M,M]^d$ implies that for any H\"older (or uniformly) continuous function $f:S\to\mathbb{R}^p$, there exists a H\"older (or uniformly) continuous function $g:[-M,M]^d\to\mathbb{R}^p$ such that $g(Ax)=f(x)$ for all $x\in S$. Hence, if one has a neural network that approximates $g:[-M,M]^d\to\mathbb{R}^p$, then a layer implementing the JL embedding $A$ can be added to obtain a neural network that approximates $f:S\to\mathbb{R}^p$. Pairing JL embedding results with results on the approximation of H\"older (or uniformly) continuous functions by neural networks then yields bounds on the complexity required for a neural network to approximate H\"older (or uniformly) continuous functions on high-dimensional sets. The end result is a general theoretical framework that can be used to better explain the observed empirical successes of smaller networks in a wider variety of inverse problems than current theory allows.
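The composition $x\mapsto g(Ax)$ described above can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the paper's construction: the set $S$ is taken to be a random $k$-dimensional subspace, $A$ is a scaled Gaussian random matrix (one common way to obtain a JL-type embedding with high probability), and `g` is an untrained one-hidden-layer ReLU network standing in for an approximator on the low-dimensional cube. All sizes and names (`D`, `d`, `k`, `p`, `f_hat`) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes: ambient dim D, embedded dim d, intrinsic dim k, output dim p.
D, d, k, p = 1000, 200, 5, 3

# Assumed low-complexity set S: points in a random k-dimensional subspace of R^D.
basis = np.linalg.qr(rng.standard_normal((D, k)))[0]   # D x k, orthonormal columns
points = (basis @ rng.standard_normal((k, 50))).T       # 50 points of S, shape (50, D)

# Gaussian random matrix, scaled so that E||Ax||^2 = ||x||^2.
A = rng.standard_normal((d, D)) / np.sqrt(d)

def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

# Ratio of embedded to original distances; values near 1 mean low distortion.
orig_dist = pairwise_distances(points)
emb_dist = pairwise_distances(points @ A.T)
off_diag = ~np.eye(len(points), dtype=bool)
ratios = emb_dist[off_diag] / orig_dist[off_diag]

# Stand-in network g on R^d (random weights, purely hypothetical).
W1, b1 = rng.standard_normal((16, d)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((p, 16)), rng.standard_normal(p)

def g(z):
    """One-hidden-layer ReLU network R^d -> R^p."""
    return W2 @ np.maximum(W1 @ z + b1, 0.0) + b2

def f_hat(x):
    """Network on R^D obtained by prepending the linear layer A to g."""
    return g(A @ x)

print("distance ratio range:", ratios.min(), ratios.max())
print("output shape:", f_hat(points[0]).shape)
```

Because the added layer is linear, it can also be folded into the first weight matrix of `g` (replacing `W1` with `W1 @ A`), so the composed network has the same depth as `g` and only its input width grows from `d` to `D`.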

