We derive optimal rates of convergence in the supremum norm for estimating the H\"older-smooth mean function of a stochastic process that is repeatedly and discretely observed, with additional errors, at fixed, multivariate, synchronous design points, the typical scenario for machine-recorded functional data. As with the optimal rates in $L_2$ obtained in \citet{cai2011optimal}, a discretization term dominates for sparse designs, while in the dense case the parametric $\sqrt n$ rate can be achieved as if the $n$ processes were observed continuously and without errors. The supremum norm is of practical interest since it corresponds to the visualization of the estimation error and forms the basis for the construction of uniform confidence bands. We show that, in contrast to the analysis in $L_2$, there is an intermediate regime between the sparse and dense cases that is dominated by the contribution of the observation errors. Furthermore, under the supremum norm, interpolation estimators which suffice in $L_2$ turn out to be sub-optimal in the dense setting, which helps to explain their poor empirical performance. In contrast to previous contributions involving the supremum norm, we discuss optimality even in the multivariate setting, and for dense designs we obtain the $\sqrt n$ rate of convergence without additional logarithmic factors. We also establish a central limit theorem in the supremum norm, and provide simulations and real-data applications to illustrate our results.