Function approximation based on data drawn randomly from an unknown distribution is an important problem in machine learning. In contrast to the prevalent paradigm of solving this problem by minimizing a loss functional, we give a direct one-shot construction together with optimal error bounds under the manifold assumption; i.e., one assumes that the data is sampled from an unknown sub-manifold of a high-dimensional Euclidean space. A great deal of research deals with obtaining information about this manifold, such as the eigendecomposition of the Laplace-Beltrami operator or coordinate charts, and using this information for function approximation. This two-step approach introduces extra errors into the approximation, stemming from the estimation of these basic quantities from the data, in addition to the errors inherent in function approximation. In Neural Networks, 132:253-268, 2020, we proposed a one-shot direct method to achieve function approximation without requiring the extraction of any information about the manifold other than its dimension. However, one cannot pin down the class of approximants used in that paper. In this paper, we view the unknown manifold as a sub-manifold of an ambient hypersphere and study the question of constructing a one-shot approximation using spherical polynomials on the hypersphere. Our approach does not require preprocessing of the data to obtain information about the manifold other than its dimension. We give optimal rates of approximation for relatively "rough" functions.