Deep neural networks have achieved tremendous success owing to their representation power and their adaptivity to low-dimensional structures. Their potential for estimating structured regression functions has recently been established in the literature. However, most existing studies require the input dimension to be fixed; they consequently ignore the effect of dimension on the rate of convergence, which hampers their application to modern high-dimensional big data. In this paper, we bridge this gap by analyzing a $k^{\text{th}}$-order nonparametric interaction model in both the growing-dimension regime ($d$ grows with $n$ but at a slower rate) and the high-dimensional regime ($d \gtrsim n$). In the latter case, sparsity assumptions and associated regularization are required in order to obtain optimal rates of convergence. A new challenge in the diverging-dimension setting arises in calculating the mean-square error: the covariance terms among the estimated additive components are an order of magnitude larger than the variance terms, and they can deteriorate statistical properties without proper care. We introduce a critical debiasing technique to address this problem. We show that, under certain standard assumptions, debiased deep neural networks achieve a minimax optimal rate in terms of both $n$ and $d$. Our proofs rely crucially on this novel debiasing technique, which makes the covariances of the additive components negligible in the mean-square error calculation. In addition, we establish matching lower bounds.