When the unknown regression function of a single variable is known to have derivatives up to the $(\gamma+1)$th order bounded in absolute value by a common constant everywhere or almost everywhere (i.e., $(\gamma+1)$th degree of smoothness), the minimax optimal rate of the mean integrated squared error (MISE) is stated in the literature as $\left(\frac{1}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$. This paper shows that: (i) if $n\leq\left(\gamma+1\right)^{2\gamma+3}$, the minimax optimal MISE rate is $\frac{\log n}{n\log(\log n)}$ and the optimal degree of smoothness to exploit is roughly $\max\left\{ \left\lfloor \frac{\log n}{2\log\left(\log n\right)}\right\rfloor ,\,1\right\} $; (ii) if $n>\left(\gamma+1\right)^{2\gamma+3}$, the minimax optimal MISE rate is $\left(\frac{1}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ and the optimal degree of smoothness to exploit is $\gamma+1$. The fundamental contribution of this paper is a set of metric entropy bounds that we develop for smooth function classes. Some of our bounds are original, and some improve and/or generalize those in the literature (e.g., Kolmogorov and Tikhomirov, 1959). Our metric entropy bounds allow us to show phase transitions in the minimax optimal MISE rates associated with some commonly seen smoothness classes as well as non-standard smoothness classes, and may also be of independent interest beyond nonparametric regression problems.
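To make the two regimes concrete, the following minimal Python sketch evaluates the threshold $\left(\gamma+1\right)^{2\gamma+3}$ and the corresponding rate expressions for a given $n$ and $\gamma$. The function names are illustrative and do not come from the paper, the rates are stated only up to constants, and the sketch assumes $n\geq3$ so that $\log(\log n)>0$.

```python
import math


def threshold(gamma: int) -> int:
    """Sample-size threshold (gamma + 1)^(2*gamma + 3) separating the two regimes."""
    return (gamma + 1) ** (2 * gamma + 3)


def mise_rate(n: int, gamma: int) -> float:
    """Rate expression (up to constants) for the regime that n falls into."""
    if n < 3:
        # Assumption of this sketch: log(log n) must be positive.
        raise ValueError("n must be at least 3")
    if n > threshold(gamma):
        # Regime (ii): the classical rate (1/n)^((2*gamma+2)/(2*gamma+3)).
        return n ** (-(2 * gamma + 2) / (2 * gamma + 3))
    # Regime (i): log n / (n * log(log n)).
    return math.log(n) / (n * math.log(math.log(n)))


def smoothness_to_exploit(n: int, gamma: int) -> int:
    """Degree of smoothness to exploit (approximate in regime (i))."""
    if n < 3:
        raise ValueError("n must be at least 3")
    if n > threshold(gamma):
        return gamma + 1
    return max(math.floor(math.log(n) / (2 * math.log(math.log(n)))), 1)


if __name__ == "__main__":
    for n, gamma in [(100, 1), (1_000, 3), (1_000_000, 5)]:
        print(n, gamma, threshold(gamma),
              smoothness_to_exploit(n, gamma), mise_rate(n, gamma))
```

For instance, with $\gamma=5$ the threshold is $6^{13}$, which is roughly $1.3\times10^{10}$, so a sample of size $n=10^{6}$ falls in regime (i) and the sketch reports a degree of smoothness of 2 rather than 6.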