We propose nonuniform, data-driven parameter distributions for neural network initialization, based on derivative data of the function to be approximated. These parameter distributions are developed in the context of non-parametric regression models built on shallow neural networks, and they compare favorably to well-established uniform random feature models based on conventional weight initialization. We address the cases of Heaviside and ReLU activation functions, as well as their smooth approximations (sigmoid and softplus), and draw on recent results on the harmonic analysis and sparse representation of fully trained optimal neural networks. Extending analytic results that give exact representations, we obtain densities that concentrate in regions of the parameter space corresponding to neurons well suited to model the local derivatives of the unknown function. Based on these results, we propose simplifications of these exact densities that use approximate derivative data at the input points, allow for very efficient sampling, and bring the performance of random feature models close to that of optimal networks in several scenarios.
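The overall pipeline can be illustrated with a minimal one-dimensional numpy sketch: estimate derivative data of the target by finite differences, sample ReLU breakpoints (biases) from a density concentrated where the function curves, and fit only the linear output layer by least squares. This is a simplified stand-in for the paper's exact densities; the target function, feature count, and sampling rule are illustrative assumptions, not the proposed method itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target with localized curvature (an assumption, not from the paper).
f = lambda x: np.tanh(20.0 * (x - 0.5))

# Training data on [0, 1].
x = np.linspace(0.0, 1.0, 400)
y = f(x)

# Approximate second-derivative magnitudes via finite differences; these play
# the role of the approximate derivative data that the densities concentrate on.
d2 = np.abs(np.gradient(np.gradient(y, x), x))
p = d2 / d2.sum()

# Sample ReLU breakpoints from the data-driven density, with random orientations.
m = 50
b = rng.choice(x, size=m, p=p)
s = rng.choice([-1.0, 1.0], size=m)

# Random feature matrix: phi_i(x) = relu(s_i * (x - b_i)), plus a bias column.
Phi = np.maximum(s[None, :] * (x[:, None] - b[None, :]), 0.0)
Phi = np.hstack([Phi, np.ones((x.size, 1))])

# Only the outer (linear) weights are fitted, as in a random feature model.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
rmse = np.sqrt(np.mean((Phi @ w - y) ** 2))
```

Because the breakpoints cluster near the region of large curvature, far fewer features are wasted on flat regions than with uniform sampling of the biases.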