We propose a sparse deep ReLU network (SDRN) estimator of the regression function, obtained from regularized empirical risk minimization with a Lipschitz loss function. Our framework can be applied to a variety of regression and classification problems. We establish novel non-asymptotic excess risk bounds for the SDRN estimator when the regression function belongs to a Sobolev space with mixed derivatives. We obtain a new, nearly optimal risk rate: when the feature dimension is fixed, the SDRN estimator achieves nearly the same optimal minimax convergence rate as one-dimensional nonparametric regression, with the dimension entering only through a logarithmic factor. The estimator has a slightly slower rate when the dimension grows with the sample size. We show that, to attain the nearly optimal risk rate, the depth of the SDRN estimator need only grow logarithmically in the sample size, while the total number of nodes and weights grows polynomially. The proposed SDRN can therefore go deeper with fewer parameters, estimating the regression function well while overcoming the overfitting problem encountered by conventional feed-forward neural networks.
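For concreteness, a minimal sketch of the penalized empirical risk minimization underlying the estimator, in illustrative notation not taken from the paper: writing $\mathcal{F}_n$ for a class of sparse deep ReLU networks, $\ell$ for a Lipschitz loss, $\lambda_n$ for a tuning parameter, and $J(f)$ for a sparsity-inducing penalty on the network weights, the SDRN estimator has the generic form
\[
\widehat{f}_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}_n} \left\{ \frac{1}{n} \sum_{i=1}^{n} \ell\big(Y_i, f(X_i)\big) + \lambda_n J(f) \right\},
\]
where $(X_i, Y_i)_{i=1}^n$ denote the observed feature-response pairs. The specific choices of $\mathcal{F}_n$, $\ell$, and $J$ are assumptions for illustration here; the stated excess risk bounds concern estimators of this regularized form.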