It is common to model a deterministic response function, such as the output of a computer experiment, as a Gaussian process with a Mat\'ern covariance kernel. The smoothness parameter of a Mat\'ern kernel determines many important properties of the model in the large-data limit, including the rate of convergence of the conditional mean to the response function. We prove that the maximum likelihood estimate of the smoothness parameter cannot asymptotically undersmooth the truth when the data are obtained on a fixed bounded subset of $\mathbb{R}^d$. That is, if the data-generating response function has Sobolev smoothness $\nu_0 > d/2$, then the smoothness parameter estimate cannot be asymptotically less than $\nu_0$. This lower bound is sharp. Additionally, we show that maximum likelihood estimation recovers the true smoothness for a class of compactly supported self-similar functions. For cross-validation we prove an asymptotic lower bound of $\nu_0 - d/2$ on the smoothness estimate, which, however, is unlikely to be sharp. The results are based on approximation theory in Sobolev spaces and on some general theorems that restrict the set of values that the parameter estimators can take.
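To make the estimation problem concrete, the following is a minimal sketch of maximum likelihood estimation of the Mat\'ern smoothness parameter from noiseless evaluations of a deterministic function in one dimension. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`matern_kernel`, `profile_neg_log_lik`, `ml_smoothness`), the test function $|x - 0.4|^{1.5}$, the fixed length-scale, the zero prior mean, the grid search over $\nu$, and the small diagonal jitter added for numerical stability. The variance parameter is profiled out in closed form.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.special import gammaln, kv


def matern_kernel(x, y, nu, length_scale=1.0):
    """Unit-variance Matern covariance between 1-D point sets x and y."""
    r = np.abs(x[:, None] - y[None, :])
    z = np.sqrt(2.0 * nu) * r / length_scale
    k = np.ones_like(z)  # the kernel equals 1 at zero distance
    pos = z > 0
    # 2^(1 - nu) / Gamma(nu) * z^nu * K_nu(z), with the prefactor in log space
    log_const = (1.0 - nu) * np.log(2.0) - gammaln(nu)
    k[pos] = np.exp(log_const + nu * np.log(z[pos])) * kv(nu, z[pos])
    return k


def profile_neg_log_lik(nu, x, y, length_scale=1.0, jitter=1e-8):
    """Negative Gaussian log-likelihood with the variance profiled out."""
    n = x.size
    # Jitter is an illustrative numerical safeguard, not part of the model.
    K = matern_kernel(x, x, nu, length_scale) + jitter * np.eye(n)
    c, lower = cho_factor(K, lower=True)
    alpha = cho_solve((c, lower), y)
    sigma2_hat = y @ alpha / n  # closed-form ML estimate of the variance
    log_det = 2.0 * np.sum(np.log(np.diag(c)))
    return 0.5 * (n * np.log(sigma2_hat) + log_det
                  + n * (1.0 + np.log(2.0 * np.pi)))


def ml_smoothness(x, y, nu_grid, length_scale=1.0):
    """Grid-search ML estimate of nu; returns the minimiser and all values."""
    nll = np.array([profile_neg_log_lik(nu, x, y, length_scale)
                    for nu in nu_grid])
    return nu_grid[np.argmin(nll)], nll


# Hypothetical setup: a deterministic function with a kink-type singularity,
# evaluated without noise on a fixed bounded domain.
x = np.linspace(0.0, 1.0, 25)
y = np.abs(x - 0.4) ** 1.5
nu_grid = np.linspace(0.5, 3.0, 26)
nu_hat, nll = ml_smoothness(x, y, nu_grid, length_scale=0.5)
print(f"grid ML estimate of the smoothness: {nu_hat:.2f}")
```

The results above are asymptotic, so a single small design like this one only illustrates the estimator and does not verify the bounds. Repeating the computation on increasingly dense designs is one way to explore the behaviour numerically, although the kernel matrix becomes ill-conditioned quickly for large $\nu$.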