Large datasets are gathered daily from different remote sensing platforms. Recently, statistical co-kriging models, aided by scalable techniques, have been able to combine such datasets by using spatially varying bias corrections. Bayesian inference for these models is usually carried out via Markov chain Monte Carlo (MCMC) methods, which can mix and converge slowly (sometimes prohibitively so) because they require simulating high-dimensional random effect vectors from their posteriors given large datasets. To enable fast inference in big data spatial problems, we propose the recursive nearest neighbor co-kriging (RNNC) model. Based on this model, we develop two computationally efficient inferential procedures: a) the collapsed RNNC, which reduces the posterior sampling space by integrating out the latent processes, and b) the conjugate RNNC, an MCMC-free inference procedure that significantly reduces the computational time without sacrificing prediction accuracy. An important highlight of conjugate RNNC is that it enables fast inference in massive multifidelity datasets by avoiding expensive integration algorithms. The computational efficiency and good predictive performance of our proposed algorithms are demonstrated on benchmark examples and on the analysis of High-resolution Infrared Radiation Sounder data gathered from two NOAA polar-orbiting satellites, for which we reduced the computational time from multiple hours to just a few minutes.
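To make the nearest-neighbor idea behind the model name concrete, the following is a minimal sketch of a generic nearest-neighbor approximation to a single-fidelity Gaussian process log-likelihood. It is not the RNNC model or either of the proposed inferential procedures. The exponential covariance, the parameter names (`sigma2`, `phi`, `tau2`), the neighbor count `m`, and the function names are all assumptions made for illustration. Each observation is conditioned on at most `m` nearest previously ordered points, so every linear solve involves an `m x m` matrix rather than an `n x n` one. The neighbor search here is brute force for brevity.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=0.2):
    """Exponential covariance between two sets of 2-D locations (illustrative choice)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def nn_loglik(y, locs, m=10, sigma2=1.0, phi=0.2, tau2=0.1):
    """Nearest-neighbor approximation to a zero-mean GP log-likelihood.

    y[i] is conditioned only on its (at most) m nearest neighbors among
    the points that precede it in the given ordering.
    """
    total = 0.0
    for i in range(len(y)):
        mean, var = 0.0, sigma2 + tau2  # marginal moments for the first point
        if i > 0:
            dist = np.linalg.norm(locs[:i] - locs[i], axis=1)
            nb = np.argsort(dist)[:m]  # indices of nearest earlier points
            c_nn = exp_cov(locs[nb], locs[nb], sigma2, phi) + tau2 * np.eye(len(nb))
            c_in = exp_cov(locs[i:i + 1], locs[nb], sigma2, phi)[0]
            w = np.linalg.solve(c_nn, c_in)  # kriging weights from an m x m system
            mean = w @ y[nb]
            var = sigma2 + tau2 - w @ c_in
        total += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total

def exact_loglik(y, locs, sigma2=1.0, phi=0.2, tau2=0.1):
    """Exact zero-mean GP log-likelihood, for comparison on small problems."""
    k = exp_cov(locs, locs, sigma2, phi) + tau2 * np.eye(len(y))
    _, logdet = np.linalg.slogdet(k)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(k, y))
```

When `m` is at least `n - 1`, every point is conditioned on all earlier points and the approximation coincides with the exact likelihood by the chain rule of conditional densities. Smaller values of `m` trade accuracy for speed.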