Radial basis function neural networks (\emph{RBFNN}) are well known for their capability to approximate any continuous function on a closed bounded set with arbitrary precision, given enough hidden neurons. In this paper, we introduce the first algorithm to construct coresets for \emph{RBFNNs}, i.e., small weighted subsets that approximate the loss of the input data on any radial basis function network and thus approximate any function defined by an \emph{RBFNN} on the larger input data. In particular, we construct coresets for radial basis and Laplacian loss functions. We then use our coresets to obtain a provable data subset selection algorithm for training deep neural networks. Since our coresets approximate every such function, they also approximate the gradient of each weight in a neural network, which is a particular function of the input. Finally, we perform empirical evaluations on function approximation and data subset selection on popular network architectures and data sets, demonstrating the efficacy and accuracy of our coreset construction.
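To make the notion of a weighted subset approximating a sum over the full input concrete, the following minimal sketch compares the summed output of a small Gaussian RBF network on all points with the weighted sum on a subset. It is an illustration only: the subset is drawn by uniform sampling with weights $n/m$, which is a stand-in and not the coreset construction introduced in this paper, and it carries no guarantee that holds for every network simultaneously. The names `rbf_network` and `uniform_weighted_subset`, the parameter `gamma`, and the synthetic data are all hypothetical.

```python
import numpy as np


def rbf_network(X, centers, alphas, gamma=1.0):
    """Gaussian RBF network: sum_j alphas[j] * exp(-gamma * ||x - c_j||^2)."""
    # Squared distances between every point and every center, shape (n, k).
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists) @ alphas


def uniform_weighted_subset(n, m, rng):
    """Stand-in for a coreset (assumption): m uniform indices, each weighted n / m."""
    idx = rng.choice(n, size=m, replace=False)
    weights = np.full(m, n / m)
    return idx, weights


rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))           # synthetic input data
centers = rng.normal(size=(4, 3))        # hypothetical RBF centers
alphas = rng.uniform(0.5, 1.5, size=4)   # hypothetical output weights

idx, w = uniform_weighted_subset(len(X), 500, rng)

# Sum of the network output over all points vs. the weighted subset.
full_sum = rbf_network(X, centers, alphas).sum()
subset_sum = (w * rbf_network(X[idx], centers, alphas)).sum()
rel_err = abs(full_sum - subset_sum) / abs(full_sum)
print(f"relative error of the weighted subset: {rel_err:.4f}")
```

A coreset in the sense described above would keep this error small for every choice of `centers` and `alphas` at once, rather than for one fixed network as checked here.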