In this paper, we consider robust nonparametric regression using deep neural networks with the ReLU activation function. While several existing theoretically justified methods are geared towards robustness against identical heavy-tailed noise distributions, the rise of adversarial attacks has emphasized the importance of safeguarding estimation procedures against systematic contamination. We approach this statistical issue by shifting our focus towards estimating conditional distributions. To do so robustly, we introduce a novel estimation procedure based on $\ell$-estimation. Under a mild model assumption, we establish general non-asymptotic risk bounds for the resulting estimators, showing their robustness against contamination, outliers, and model misspecification. We then apply our approach to deep ReLU neural networks. When the model is well-specified and the regression function belongs to an $\alpha$-H\"older class, $\ell$-type estimation on suitable networks yields estimators that achieve the minimax optimal rate of convergence. Additionally, we demonstrate that deep $\ell$-type estimators can circumvent the curse of dimensionality when the regression function is assumed to closely resemble a composition of several H\"older functions. To achieve this, we design new deep fully-connected ReLU neural networks to approximate this composition class. This approximation result may be of independent interest.