Accurate spatial interpolation of the air quality index (AQI), computed from concentrations of multiple air pollutants, is essential for regulatory decision-making, yet AQI fields are inherently non-Gaussian and often exhibit complex nonlinear spatial structure. Classical spatial prediction methods such as kriging are linear and rely on Gaussian assumptions, which limits their ability to capture these features and to provide reliable predictive distributions. In this study, we propose \textit{deep classifier kriging} (DCK), a flexible, distribution-free deep learning framework for estimating full predictive distribution functions for univariate and bivariate spatial processes, together with a \textit{data fusion} mechanism that enables modeling of non-collocated bivariate processes and integration of heterogeneous air pollution data sources. Through extensive simulation experiments, we show that DCK consistently outperforms conventional approaches in predictive accuracy and uncertainty quantification. We further apply DCK to probabilistic spatial prediction of AQI by fusing sparse but high-quality station observations with spatially continuous yet biased auxiliary model outputs, yielding spatially resolved predictive distributions that support downstream tasks such as exceedance and extreme-event probability estimation for regulatory risk assessment and policy formulation.