A Non-stationary, Amortized, Transfer Learning Approach for Modeling Italian Air Quality

Air quality monitoring in Italy relies on sparse, irregularly located ground-based stations that provide high-quality but spatially incomplete measurements of pollution. Chemical transport models (CTMs) offer full spatial and temporal coverage but smooth over local variability. We develop a spatial transfer-learning framework that integrates these two data sources to produce daily, fine-grid predictions of nitrogen dioxide (NO$_2$) concentrations across Italy for 2023, with uncertainty quantification. The resulting maps provide a resource for decision making in downstream applications such as epidemiology and environmental policy. Our approach builds on the geostatistical LatticeKrig framework, which uses compactly supported basis functions and coefficients governed by a sparse precision matrix. We learn a non-stationary, anisotropic correlation structure from the gridded CTM outputs using an image-to-image neural architecture that estimates millions of spatially varying parameters in seconds. The basis-function representation allows this covariance structure to be transferred to the point-level station data and projected onto a finer prediction grid, a key extension for handling the change of support between data sources. A likelihood-based refinement step then adjusts the correlation range to recover fine-scale variability smoothed out by the gridded data. The proposed methodology yields a flexible, non-stationary, and anisotropic representation of the spatial process that better accommodates the complex geography of Italy. Performance is assessed through experiments on both gridded CTM outputs and point-level station measurements, demonstrating improvements over the stationary formulation.
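To make the LatticeKrig ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard LatticeKrig construction: Wendland basis functions centred on a regular lattice, and coefficients with precision matrix $Q = B^\top B$, where $B$ is a spatial autoregressive (SAR) matrix with $4 + \kappa^2$ on the diagonal and $-1$ for each lattice neighbour. The grid size, the `kappa2` values, the station coordinates, and all function names are hypothetical placeholders. In the framework described here the spatially varying parameters would come from the neural estimator, and the anisotropic stencil weights are omitted for brevity.

```python
import math

def wendland(d):
    """Compactly supported Wendland function; exactly zero for d >= 1."""
    if d >= 1.0:
        return 0.0
    return (1.0 - d) ** 6 * (35.0 * d * d + 18.0 * d + 3.0) / 3.0

def basis_matrix(points, nodes, radius):
    """Rows are locations, columns are basis functions centred at lattice nodes."""
    return [[wendland(math.dist(p, c) / radius) for c in nodes] for p in points]

def sar_matrix(nx, ny, kappa2):
    """SAR matrix B: 4 + kappa^2 on the diagonal, -1 for each lattice neighbour."""
    n = nx * ny
    B = [[0.0] * n for _ in range(n)]
    for i in range(nx):
        for j in range(ny):
            k = i * ny + j
            B[k][k] = 4.0 + kappa2[k]
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < nx and 0 <= jj < ny:
                    B[k][ii * ny + jj] = -1.0
    return B

def precision(B):
    """Precision matrix Q = B^T B of the basis coefficients (dense for clarity)."""
    n = len(B)
    return [[sum(B[k][r] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

# Hypothetical 5 x 5 lattice with unit spacing.
nx, ny = 5, 5
nodes = [(float(i), float(j)) for i in range(nx) for j in range(ny)]

# Placeholder spatially varying kappa^2 (non-stationarity): grows along x.
kappa2 = [0.1 + 0.05 * i for i in range(nx) for _ in range(ny)]

B = sar_matrix(nx, ny, kappa2)
Q = precision(B)

# Hypothetical station locations; the same basis can be evaluated at
# irregular points or on a finer prediction grid.
stations = [(1.3, 2.7), (3.6, 0.4)]
Phi = basis_matrix(stations, nodes, radius=2.5)

nonzeros_per_row = [sum(1 for v in row if v != 0.0) for row in Q]
print("max nonzeros per row of Q:", max(nonzeros_per_row), "of", nx * ny)
print("active basis functions at station 0:", sum(1 for v in Phi[0] if v > 0.0))
```

Because the process is defined through basis functions and their coefficients rather than through a covariance function evaluated on a fixed grid, the same `Q` can be paired with a basis matrix evaluated at station locations or at any finer set of prediction points, which is the property the transfer step relies on. A practical implementation would store `B` and `Q` in a sparse format rather than as dense lists.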
