In spatial statistics and machine learning, the kernel matrix plays a pivotal role in prediction, classification, and maximum likelihood estimation. A thorough examination reveals that the kernel matrix becomes ill-conditioned as the sample size grows, provided the sampling locations are fairly evenly distributed. This ill-conditioning poses significant challenges for the numerical algorithms used in prediction and estimation, and it necessitates approximating both the predictor and the Gaussian likelihood. A review of current methods for handling large spatial data indicates that some of them fail to address the ill-conditioning problem. Such ill-conditioning often leads to low-rank approximations of the stochastic process. This paper introduces several optimality criteria and provides a solution for each.
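The growth in ill-conditioning described above can be illustrated numerically. The following is a minimal sketch, not the paper's method: it assumes an exponential (Matérn-1/2) kernel, a length scale of 0.2, and evenly spaced locations on the unit interval, all of which are illustrative choices that do not come from the abstract. The function names are likewise hypothetical.

```python
import numpy as np


def exponential_kernel_matrix(n, length_scale=0.2):
    """Kernel matrix K[i, j] = exp(-|x_i - x_j| / length_scale).

    Assumption: n evenly spaced sampling locations on [0, 1].
    """
    x = np.linspace(0.0, 1.0, n)
    distances = np.abs(x[:, None] - x[None, :])
    return np.exp(-distances / length_scale)


def condition_numbers(sizes, length_scale=0.2):
    """2-norm condition number of the kernel matrix for each sample size."""
    return [
        float(np.linalg.cond(exponential_kernel_matrix(n, length_scale)))
        for n in sizes
    ]


if __name__ == "__main__":
    sizes = [10, 20, 40, 80, 160]
    for n, cond in zip(sizes, condition_numbers(sizes)):
        # The condition number rises as the evenly spaced grid becomes denser.
        print(f"n = {n:4d}   cond(K) = {cond:.3e}")
```

The exponential kernel is used here only because its condition number grows slowly enough to be computed reliably in double precision. Smoother kernels, such as the squared exponential, typically degrade much faster on the same grid.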