Traditional geostatistical methods assume independence between observation locations and the spatial process of interest. Violations of this independence assumption are referred to as preferential sampling (PS). Standard methods to address PS rely on estimating complex shared latent variable models and can be difficult to apply in practice. We study the use of inverse sampling intensity weighting (ISIW) for PS adjustment in model-based geostatistics. ISIW is a two-stage approach: we first estimate the sampling intensity at the observation locations, and then use intensity-based weights in a weighted likelihood to adjust the parameter estimates. Prediction follows by substituting the adjusted parameter estimates into kriging. We introduce an implementation of ISIW based on the Vecchia approximation, enabling computational gains while maintaining strong predictive accuracy. Interestingly, we find that ISIW outpredicts standard PS methods under misspecification of the sampling design, and that accuracy of parameter estimation has little correlation with predictive performance, raising questions about the conditions driving optimal implementation of kriging-based predictors under PS. Our work highlights the potential of ISIW to adjust for PS in an intuitive, fast, and effective manner. We illustrate these ideas on spatial prediction of lead concentrations measured through moss biomonitoring data in Galicia, Spain, and PM2.5 concentrations from the U.S. EPA Air Quality System network in California.
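The two-stage idea can be sketched in a few lines of Python. This is only a minimal illustration under strong simplifying assumptions, not the implementation studied here: locations are one-dimensional, the sampling intensity is estimated with a plain Gaussian kernel without edge correction, and the weighted likelihood uses an independent Gaussian working model rather than a Gaussian process with the Vecchia approximation. The function names `kernel_intensity`, `isiw_weights`, and `weighted_gaussian_mle`, and the bandwidth value, are hypothetical.

```python
import math


def kernel_intensity(locs, bandwidth):
    """Stage 1: Gaussian kernel estimate of the sampling intensity
    at each observed 1-D location (no edge correction; an assumption)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((s - t) / bandwidth) ** 2) for t in locs) / norm
        for s in locs
    ]


def isiw_weights(locs, bandwidth):
    """Inverse-intensity weights, rescaled so they sum to the sample size."""
    inverse = [1.0 / lam for lam in kernel_intensity(locs, bandwidth)]
    scale = len(locs) / sum(inverse)
    return [scale * v for v in inverse]


def weighted_gaussian_mle(y, w):
    """Stage 2 (simplified): maximise sum_i w_i * log N(y_i; mu, sigma2).
    Under this independent working model the maximiser is the weighted
    mean and the weighted variance."""
    total = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, y)) / total
    sigma2 = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y)) / total
    return mu, sigma2


if __name__ == "__main__":
    # Toy example: the field is y = x on [0, 1], and sampling is
    # concentrated where the field is high (preferential sampling).
    locs = [0.1, 0.3, 0.5, 0.7, 0.85, 0.88, 0.9, 0.92, 0.95]
    y = list(locs)
    w = isiw_weights(locs, bandwidth=0.1)
    mu_w, sigma2_w = weighted_gaussian_mle(y, w)
    print("unweighted mean:", sum(y) / len(y))
    print("ISIW-weighted mean:", mu_w)
```

In this toy setting the observations cluster where the field is high, so the unweighted mean overshoots the domain average of 0.5, while down-weighting the densely sampled region pulls the estimate back toward it. In a full analysis the adjusted parameter estimates would then be plugged into a kriging predictor.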