Existing theories of deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to the intrinsic data structure. In real-world applications, the assumption that data lie exactly on a low-dimensional manifold is stringent. This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion, the effective Minkowski dimension. We prove that the sample complexity of deep nonparametric regression depends only on the effective Minkowski dimension of $\mathcal{S}$, denoted by $p$. We further illustrate our theoretical findings by considering nonparametric regression with an anisotropic Gaussian random design $N(0,\Sigma)$, where $\Sigma$ is full rank. When the eigenvalues of $\Sigma$ have an exponential or polynomial decay, the effective Minkowski dimension of such a Gaussian random design is $p=\mathcal{O}(\sqrt{\log n})$ or $p=\mathcal{O}(n^\gamma)$, respectively, where $n$ is the sample size and $\gamma\in(0,1)$ is a small constant depending on the polynomial decay rate. Our theory shows that, even when the manifold assumption does not hold, deep neural networks can still adapt to the effective Minkowski dimension of the data and circumvent the curse of ambient dimensionality for moderate sample sizes.
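As a rough numerical illustration of the Gaussian example, the sketch below counts how many leading eigen-directions of $\Sigma$ must be kept before the leftover variance falls below a budget of $1/n$. This truncation count is only a hypothetical proxy chosen for illustration; it is not the paper's definition of the effective Minkowski dimension. The decay profiles `exp(-k)` and `k**-2`, the ambient dimension `d = 2000`, and the function name `truncation_dimension` are all assumptions, so the resulting growth rates need not match the rates stated in the abstract.

```python
import numpy as np


def truncation_dimension(eigenvalues, tail_budget):
    """Smallest k such that the variance outside the top-k eigen-directions
    is at most tail_budget.

    Hypothetical proxy for illustration only; NOT the paper's definition
    of effective Minkowski dimension.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    # tail[k] = sum of lam[k:], the variance left after keeping k directions.
    # A trailing 0.0 covers the case where every direction is kept.
    tail = np.concatenate([np.cumsum(lam[::-1])[::-1], [0.0]])
    # argmax on a boolean array returns the index of the first True entry.
    return int(np.argmax(tail <= tail_budget))


if __name__ == "__main__":
    d = 2000                           # ambient dimension (assumed)
    k = np.arange(1, d + 1)
    spectra = {
        "exponential decay": np.exp(-k.astype(float)),  # assumed profile
        "polynomial decay": k.astype(float) ** -2,      # assumed profile
    }
    for name, lam in spectra.items():
        for n in (10**2, 10**3, 10**4):
            p_proxy = truncation_dimension(lam, tail_budget=1.0 / n)
            print(f"{name:18s} n={n:6d} proxy dimension={p_proxy} of d={d}")
```

Under these assumed spectra, the count stays far below the ambient dimension `d` and grows much more slowly with `n` for the exponential profile than for the polynomial one, which is the qualitative behavior the abstract describes.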