Deep neural networks (DNNs) have received increasing attention in machine learning applications over the last several years. Recently, a non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator with ReLU activation functions for estimating regression models. The paper at hand slightly improves the current error bound, based on the latest results on the approximation ability of DNNs. More importantly, however, a non-random subsampling technique, scalable subsampling, is applied to construct a 'subagged' DNN estimator. Under regularity conditions, it is shown that the subagged DNN estimator is computationally efficient without sacrificing accuracy for either estimation or prediction tasks. Beyond point estimation/prediction, we propose different approaches to build confidence and prediction intervals based on the subagged DNN estimator. In addition to being asymptotically valid, the proposed confidence/prediction intervals appear to work well in finite samples. All in all, the scalable subsampling DNN estimator offers the complete package in terms of statistical inference, i.e., (a) computational efficiency; (b) point estimation/prediction accuracy; and (c) the ability to construct practically useful confidence and prediction intervals.
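As a rough illustration of the subagging idea described in the abstract, the following minimal sketch builds deterministic (non-random) subsamples as consecutive index blocks, fits one estimator per block, and averages the resulting predictions. Everything named here is an assumption made for illustration and is not taken from the paper: the helper names `scalable_subsample_indices` and `subagged_predict`, the block size `b = n**0.8`, the shift `h = b` (non-overlapping blocks), the simulated data, and in particular the cubic least-squares fit, which stands in for the ReLU DNN estimator only to keep the example self-contained.

```python
import numpy as np


def scalable_subsample_indices(n, b, h):
    """Deterministic index blocks of length b, each shifted by h from the previous one."""
    if not (1 <= b <= n) or h < 1:
        raise ValueError("need 1 <= b <= n and h >= 1")
    q = (n - b) // h + 1  # number of blocks that fit inside 0..n-1
    return [np.arange(j * h, j * h + b) for j in range(q)]


def subagged_predict(x, y, x_new, b, h, fit, predict):
    """Fit one estimator per block and average their predictions at x_new."""
    blocks = scalable_subsample_indices(len(x), b, h)
    preds = [predict(fit(x[idx], y[idx]), x_new) for idx in blocks]
    return np.mean(preds, axis=0)


# Stand-in learner (assumption): a cubic least-squares fit, not a ReLU DNN.
def fit_poly(x, y):
    return np.polyfit(x, y, deg=3)


def predict_poly(coef, x_new):
    return np.polyval(coef, x_new)


# Simulated i.i.d. regression data (assumption, for illustration only).
rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(2.0 * x) + 0.1 * rng.normal(size=n)

b = int(n ** 0.8)  # hypothetical block size; the exponent is an illustrative choice
h = b              # shift equal to block size gives non-overlapping blocks
x_new = np.linspace(-0.9, 0.9, 5)
y_hat = subagged_predict(x, y, x_new, b, h, fit_poly, predict_poly)
```

Because each fit only sees `b` of the `n` observations and the fits are independent of one another, they can run in parallel; this is the kind of computational saving the abstract refers to. Swapping `fit_poly` and `predict_poly` for a neural-network trainer and its forward pass would leave the rest of the sketch unchanged.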