We explain how to use the Kolmogorov Superposition Theorem (KST) to break the curse of dimensionality when approximating a dense class of multivariate continuous functions. We first show that there is a class of functions in $C([0,1]^d)$, called Kolmogorov-Lipschitz (KL) continuous functions, which can be approximated by a special ReLU neural network with two hidden layers at a dimension-independent approximation rate $O(1/n)$, with an approximation constant that grows only quadratically in $d$. The number of parameters used in such a neural network approximation is $(6d+2)n$. Next, we introduce KB-splines by replacing the outer function with linear B-splines, and smooth the KB-splines to obtain the so-called LKB-splines, which serve as the basis for approximation. Our numerical evidence shows that the curse of dimensionality is broken in the following sense: when using the standard discrete least squares (DLS) method to approximate a continuous function, there exists a pivotal set of points in $[0,1]^d$ of size at most $O(nd)$ such that the root mean square error (RMSE) of the DLS based on the pivotal set is comparable to the RMSE of the DLS based on the original point set of size $O(n^d)$. The pivotal point set is chosen using a matrix cross approximation technique, and the number of LKB-splines used for approximation equals the size of the pivotal data set. Therefore, we need neither many basis functions nor many function values to approximate a high-dimensional continuous function $f$. Hence, this study provides an approach to dimension reduction problems.
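The pivotal-set idea above can be illustrated with a minimal numerical sketch. This is not the paper's method: a generic random ReLU feature basis stands in for the LKB-splines, and QR with column pivoting stands in for the matrix cross approximation; all names and sizes here are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import qr

# Illustrative stand-ins (assumptions): random ReLU features replace
# LKB-splines; QR column pivoting replaces matrix cross approximation.
rng = np.random.default_rng(0)
d, n_basis, n_pts = 3, 40, 2000

# Sample points in [0,1]^d and a test function f
X = rng.random((n_pts, d))
f = np.cos(np.pi * X.sum(axis=1))

# Generic ReLU feature matrix A of shape (n_pts, n_basis)
W = rng.standard_normal((d, n_basis))
b = rng.random(n_basis)
A = np.maximum(X @ W + b, 0.0)

# Full discrete least squares (DLS) fit on all n_pts points
c_full, *_ = np.linalg.lstsq(A, f, rcond=None)
rmse_full = np.sqrt(np.mean((A @ c_full - f) ** 2))

# Pivotal set: rows of A picked by QR column pivoting applied to A^T,
# keeping as many points as there are basis functions
_, _, piv = qr(A.T, pivoting=True)
idx = piv[:n_basis]
c_piv, *_ = np.linalg.lstsq(A[idx], f[idx], rcond=None)
rmse_piv = np.sqrt(np.mean((A @ c_piv - f) ** 2))

print(f"full DLS RMSE: {rmse_full:.4f}, pivotal DLS RMSE: {rmse_piv:.4f}")
```

The point of the sketch is structural: the pivotal fit solves a square $n_\text{basis} \times n_\text{basis}$ system instead of an overdetermined one over all sample points, mirroring the $O(nd)$ versus $O(n^d)$ reduction described above.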