Geostatistics aims to infer a spatially continuous phenomenon from observations collected at a finite number of locations, often measured with error. Preferential sampling occurs whenever there is stochastic dependence between the spatial process and the sampling process. Ignoring it leads to biased and incorrect estimates, so recognizing it is important, though not always simple to carry out or understand. This work presents a test for preferential sampling that is simple and easy to implement, overcoming these concerns. The test is based on the dependence between the number of sampled points and the values of the corresponding measurements. Its performance is assessed through a large simulation study that considers different levels of preferentiability, a relation with a covariate, different sample sizes, and different test procedure conditions. The results are encouraging, with high rates of correct detection of preferential sampling, and are further confirmed by applying the test to well-known real data sets: lead concentrations in moss samples, and red and blue shrimp capture data.
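The idea of relating the number of sampled points to the measured values can be illustrated with a minimal sketch. This is not the procedure proposed in the paper, whose details are not given in the abstract. It assumes a hypothetical statistic (the Spearman rank correlation between each point's neighbour count within a chosen `radius` and its measured value) and a naive permutation null that ignores spatial autocorrelation in the values, so its p-values may be too small for strongly correlated fields. All function names and parameters are illustrative.

```python
import numpy as np

def neighbour_counts(coords, radius):
    """Number of other sampled points within `radius` of each sampled point."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist <= radius).sum(axis=1) - 1  # subtract 1 to exclude the point itself

def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        tied = x == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def permutation_dependence_test(coords, values, radius, n_perm=999, seed=0):
    """Two-sided permutation test of rank correlation between local
    sampling intensity (neighbour counts) and measured values.

    Assumption: values are exchangeable under the null, which ignores
    spatial autocorrelation. Counts must not all be equal, otherwise
    the correlation is undefined.
    """
    rng = np.random.default_rng(seed)
    rank_counts = average_ranks(neighbour_counts(coords, radius))
    rank_values = average_ranks(values)
    observed = np.corrcoef(rank_counts, rank_values)[0, 1]
    permuted = np.array([
        np.corrcoef(rank_counts, rng.permutation(rank_values))[0, 1]
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(np.abs(permuted) >= abs(observed))) / (n_perm + 1)
    return observed, p_value
```

Under these assumptions, a small p-value together with a positive correlation would suggest that sampling is denser where values are higher. The choice of `radius` plays the role of a tuning parameter and would need to be examined for any real data set.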