Large datasets are gathered daily from different remote sensing platforms, and statistical models are typically used to combine them while accounting for spatially varying bias corrections. Statistical inference for these models is usually based on Markov chain Monte Carlo (MCMC) samplers, which involve updating a high-dimensional random effect vector and hence suffer from slow mixing and convergence. To overcome this and enable fast inference in big spatial data problems, we propose the recursive nearest neighbor co-kriging (RNNC) model and use it as a framework to develop two computationally efficient inferential procedures: a) the collapsed RNNC, which reduces the posterior sampling space by integrating out the latent processes, and b) the conjugate RNNC, an MCMC-free inference procedure that significantly reduces the computational time without sacrificing prediction accuracy. The good computational and predictive performance of our proposed algorithms is demonstrated on benchmark examples and on the analysis of High-resolution Infrared Radiation Sounder data gathered from two NOAA polar-orbiting satellites, for which we reduced the computational time from multiple hours to just a few minutes.