Gaussian processes are used in many machine learning applications that rely on uncertainty quantification. Recently, computational tools for working with these models in geometric settings, such as when inputs lie on a Riemannian manifold, have been developed. This raises the question: can these intrinsic models be shown theoretically to lead to better performance, compared to simply embedding all relevant quantities into $\mathbb{R}^d$ and using the restriction of an ordinary Euclidean Gaussian process? To study this, we prove optimal contraction rates for intrinsic Mat\'ern Gaussian processes defined on compact Riemannian manifolds. We also prove analogous rates for extrinsic processes using trace and extension theorems between manifold and ambient Sobolev spaces: somewhat surprisingly, the rates obtained turn out to coincide with those of the intrinsic processes, provided that their smoothness parameters are matched appropriately. We illustrate these rates empirically on a number of examples, which, mirroring prior work, show that intrinsic processes can achieve better performance in practice. Therefore, our work shows that finer-grained analyses are needed to distinguish between different levels of data-efficiency of geometric Gaussian processes, particularly in settings which involve small data set sizes and non-asymptotic behavior.
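To make the intrinsic versus extrinsic distinction concrete, the following is a minimal sketch on the unit circle $S^1 \subset \mathbb{R}^2$, not the construction or experimental setup of this work. It assumes the intrinsic Matérn kernel can be approximated by a truncated series over Laplace-Beltrami eigenvalues $n^2$ with weights $(2\nu/\kappa^2 + n^2)^{-(\nu + 1/2)}$, and takes the extrinsic kernel to be a Euclidean Matérn-3/2 kernel evaluated at chordal distances. The function names, the length scale `kappa`, the truncation level `n_terms`, the noise variance, and the toy target $\sin(2\theta)$ are all illustrative assumptions. Both kernels use $\nu = 3/2$ purely for illustration, which is not a statement of the matching condition referred to above.

```python
import numpy as np


def intrinsic_matern_circle(theta1, theta2, nu=1.5, kappa=1.0, n_terms=200):
    """Intrinsic Matern kernel on the unit circle (truncated spectral series).

    Assumes Laplace-Beltrami eigenvalues n^2 and spectral weights
    (2*nu/kappa^2 + n^2)^(-(nu + d/2)) with d = 1, normalized so k(x, x) = 1.
    """
    diff = np.subtract.outer(theta1, theta2)
    n = np.arange(n_terms + 1)
    weights = (2.0 * nu / kappa**2 + n**2) ** (-(nu + 0.5))
    weights[1:] *= 2.0  # frequencies n and -n share the eigenvalue n^2
    k = np.tensordot(weights, np.cos(n[:, None, None] * diff[None]), axes=1)
    return k / weights.sum()


def extrinsic_matern_circle(theta1, theta2, kappa=1.0):
    """Euclidean Matern-3/2 kernel in R^2 restricted to the embedded circle."""
    # Chordal (ambient Euclidean) distance between points on the unit circle.
    r = 2.0 * np.abs(np.sin(np.subtract.outer(theta1, theta2) / 2.0))
    s = np.sqrt(3.0) * r / kappa
    return (1.0 + s) * np.exp(-s)


def gp_posterior_mean(kernel, x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean of a zero-mean GP regression model with Gaussian noise."""
    K = kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    return kernel(x_test, x_train) @ np.linalg.solve(K, y_train)


# Toy comparison on a small data set (illustrative only).
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 2.0 * np.pi, size=20)
y_train = np.sin(2.0 * x_train) + 0.1 * rng.standard_normal(20)
x_test = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)

for name, kernel in [("intrinsic", intrinsic_matern_circle),
                     ("extrinsic", extrinsic_matern_circle)]:
    pred = gp_posterior_mean(kernel, x_train, y_train, x_test)
    rmse = np.sqrt(np.mean((pred - np.sin(2.0 * x_test)) ** 2))
    print(f"{name}: RMSE = {rmse:.3f}")
```

On the circle the two kernels differ only in how they measure closeness (spectral structure of the manifold versus straight-line distance in the ambient space), so a toy comparison like this one says nothing about asymptotic rates. Any gap in the printed errors reflects small-sample behavior for these particular hyperparameters.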