Motivated by the developing mathematics of deep learning, we build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using elementary functions between Euclidean spaces as building blocks. Earlier results assume that the target space $\mathcal{Y}$ is a topological vector space. We overcome this limitation by ``randomization'': our approximators output discrete probability measures over $\mathcal{Y}$. When $\mathcal{X}$ and $\mathcal{Y}$ are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for H\"{o}lder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of $\mathcal{X}$ and $\mathcal{Y}$. For barycentric $\mathcal{Y}$, including Banach spaces, $\mathbb{R}$-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to $\mathcal{Y}$-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint on geometric deep learning.
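To illustrate the ``randomization'' idea, the following is a minimal sketch, not the construction of the paper. It assumes a toy target space $\mathcal{Y} = \mathbb{R}^2$, a fixed finite set of atoms, and softmax weights computed from a hypothetical score function; the names `discrete_measure`, `euclidean_barycenter`, `score_fn`, and `atoms` are all illustrative. The output is a discrete probability measure $\sum_n w_n(x)\,\delta_{y_n}$, and in this Euclidean (hence barycentric) case it reduces to a single point by taking the weighted mean.

```python
import math


def softmax(scores):
    """Numerically stable softmax: returns nonnegative weights summing to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def discrete_measure(x, atoms, score_fn):
    """Represent sum_n w_n(x) * delta_{y_n} as a list of (weight, atom) pairs.

    Assumption: the weights come from a softmax over scores, loosely in the
    spirit of attention; this is an illustrative choice only.
    """
    weights = softmax([score_fn(x, n) for n in range(len(atoms))])
    return list(zip(weights, atoms))


def euclidean_barycenter(measure):
    """Barycenter of a discrete measure on R^d: the weighted mean of its atoms."""
    dim = len(measure[0][1])
    return [sum(w * y[i] for w, y in measure) for i in range(dim)]


# Hypothetical atoms y_1, y_2, y_3 in Y = R^2.
atoms = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]


def score_fn(x, n):
    # Toy score: negative squared Euclidean distance from x to atom n.
    return -sum((xi - yi) ** 2 for xi, yi in zip(x, atoms[n]))


measure = discrete_measure((0.9, 0.1), atoms, score_fn)
point = euclidean_barycenter(measure)
```

Here `measure` plays the role of the measure-valued output, and `point` is its reduction to a $\mathcal{Y}$-valued output. For a target space without a barycenter map, only the first step would be available in this sketch.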