In this paper we analyze the $L_2$ error of neural network regression estimates with one hidden layer. Under the assumption that the Fourier transform of the regression function decays suitably fast, we show that an estimate whose initial weights are all drawn from suitable uniform distributions and whose weights are then learned by gradient descent achieves a rate of convergence of $1/\sqrt{n}$ (up to a logarithmic factor). Our statistical analysis implies that the key to this result is the proper choice of the initial inner weights together with the adjustment of the outer weights via gradient descent. This indicates that the outer weights can also be chosen simply by linear least squares. We prove a corresponding theoretical result and compare the new linear least squares neural network estimate with standard neural network estimates on simulated data. The simulations show that these theoretical considerations lead to an estimate with improved performance in many cases.
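To make the linear least squares idea concrete, the following is a minimal sketch and not the estimate analyzed in the paper. It draws the inner weights of a one-hidden-layer network once from uniform distributions, keeps them fixed, and fits only the outer weights by linear least squares. The logistic activation, the sampling range `weight_scale`, the number of hidden neurons, and the names `fit_lls_network` and `predict` are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def hidden_layer(x, inner_w, inner_b):
    """Logistic activations of the hidden layer (activation choice is an assumption)."""
    return 1.0 / (1.0 + np.exp(-(x @ inner_w + inner_b)))


def fit_lls_network(x, y, num_hidden=50, weight_scale=5.0, seed=0):
    """Random uniform inner weights, outer weights by linear least squares.

    num_hidden and weight_scale are hypothetical defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Inner weights and biases are sampled once and never trained.
    inner_w = rng.uniform(-weight_scale, weight_scale, size=(d, num_hidden))
    inner_b = rng.uniform(-weight_scale, weight_scale, size=num_hidden)
    # Design matrix: a constant column plus one column per hidden neuron.
    features = np.column_stack([np.ones(n), hidden_layer(x, inner_w, inner_b)])
    # With the inner weights fixed, the outer weights solve a linear problem.
    outer_w, *_ = np.linalg.lstsq(features, y, rcond=None)
    return inner_w, inner_b, outer_w


def predict(params, x):
    """Evaluate the fitted network at the rows of x."""
    inner_w, inner_b, outer_w = params
    return outer_w[0] + hidden_layer(x, inner_w, inner_b) @ outer_w[1:]


# Simulated data: a smooth univariate regression function plus Gaussian noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * x[:, 0]) + 0.1 * rng.standard_normal(200)

params = fit_lls_network(x, y)
train_mse = np.mean((predict(params, x) - y) ** 2)
print(f"training MSE: {train_mse:.4f}")
```

Because the inner weights are frozen, the fit reduces to a single call to `np.linalg.lstsq` and needs no step size or iteration count. In this sketch the choice of sampling range plays the role that the abstract assigns to the proper choice of the initial inner weights, so it would have to be set with care in practice.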