We propose a new scalable framework for spatio-temporal data fusion with multi-fidelity Gaussian processes (MFGPs) that enables fully likelihood-based inference for both stationary and non-stationary fidelity integration. The framework is designed for environmental applications, where abundant but noisy low-fidelity data (e.g., satellite or reanalysis products) must be fused with sparse yet accurate high-fidelity in-situ observations to obtain high-resolution reconstructions. Our key methodological contribution is a decomposed multi-fidelity covariance formulation that allows the Vecchia approximation to be applied directly to the latent low-fidelity and discrepancy processes. Combined with a Woodbury-based reconstruction, this yields a numerically stable and computationally efficient evaluation of the joint marginal likelihood without ever forming the full multi-fidelity covariance matrix. In addition, we introduce a generalized least squares (GLS) mean-removal strategy with fidelity-specific offsets, preventing systematic biases from being absorbed into cross-fidelity dependence. We validate the proposed approach through extensive experiments on synthetic data and a large-scale real-world application to wind speed reconstruction in the Lombardy region of Italy. The results show that the proposed Vecchia-based MFGP closely matches exact multi-fidelity inference in controlled settings, while substantially outperforming standard single-fidelity spatio-temporal Gaussian processes in terms of predictive accuracy, correlation, and representation of local variability in realistic large-data scenarios.
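To make the GLS mean-removal idea concrete, the following is a minimal sketch of estimating one constant offset per fidelity level by generalized least squares under a joint two-fidelity covariance. Everything specific here is an assumption made for illustration and is not taken from the paper: the autoregressive covariance structure (high fidelity = `rho` times the low-fidelity process plus an independent discrepancy), the squared-exponential kernel, all hyperparameter values, and the function names `rbf`, `joint_covariance`, and `gls_offsets`. The toy problem is small enough to use a dense Cholesky factor, so neither the Vecchia approximation nor the Woodbury-based reconstruction is involved.

```python
import numpy as np


def rbf(a, b, variance, lengthscale):
    """Squared-exponential kernel on 1-D inputs (illustrative choice)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def joint_covariance(x_lo, x_hi, rho=0.8, noise_lo=0.3, noise_hi=0.05):
    """Joint covariance of [y_lo, y_hi] under an assumed autoregressive model:
    f_hi = rho * f_lo + delta, with f_lo and delta independent GPs."""
    k_ll = rbf(x_lo, x_lo, 1.0, 0.5) + noise_lo**2 * np.eye(len(x_lo))
    k_lh = rho * rbf(x_lo, x_hi, 1.0, 0.5)
    k_hh = (
        rho**2 * rbf(x_hi, x_hi, 1.0, 0.5)   # scaled low-fidelity part
        + rbf(x_hi, x_hi, 0.2, 0.3)          # discrepancy process
        + noise_hi**2 * np.eye(len(x_hi))    # high-fidelity noise
    )
    return np.block([[k_ll, k_lh], [k_lh.T, k_hh]])


def gls_offsets(y, design, cov):
    """GLS estimate beta = (X^T K^-1 X)^-1 X^T K^-1 y, computed by
    whitening with the Cholesky factor and solving ordinary least squares."""
    chol = np.linalg.cholesky(cov)
    x_white = np.linalg.solve(chol, design)
    y_white = np.linalg.solve(chol, y)
    beta, *_ = np.linalg.lstsq(x_white, y_white, rcond=None)
    return beta


# Toy data: many noisy low-fidelity points, few accurate high-fidelity points.
rng = np.random.default_rng(0)
x_lo = np.linspace(0.0, 1.0, 50)
x_hi = np.linspace(0.05, 0.95, 8)
cov = joint_covariance(x_lo, x_hi)

# One indicator column per fidelity level gives fidelity-specific offsets.
design = np.zeros((len(x_lo) + len(x_hi), 2))
design[: len(x_lo), 0] = 1.0
design[len(x_lo):, 1] = 1.0

true_offsets = np.array([2.0, 5.0])
y = design @ true_offsets + np.linalg.cholesky(cov) @ rng.standard_normal(len(design))

offsets = gls_offsets(y, design, cov)
residual = y - design @ offsets  # mean-removed data passed on to GP inference
print("estimated offsets (low, high):", offsets)
```

Using a separate indicator column for each fidelity keeps a constant bias between the two data sources in the mean term, rather than forcing the cross-fidelity scaling `rho` to account for it. In a large-data setting the dense Cholesky factor would have to be replaced by a scalable alternative.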