Predicting species distributions with occupancy models that account for imperfect detection is now commonplace in ecology. Recently, modelling spatial and temporal autocorrelation was proposed to alleviate the lack of replication in occupancy data, which often prevents model identifiability. However, how such models perform on highly heterogeneous datasets dominated by missing or single-visit data remains an open question. Motivated by a heterogeneous fine-scale butterfly occupancy dataset, we evaluate the robustness of a multi-season occupancy model with spatial and temporal random effects to three features of such data: a skewed (Poisson) distribution of the number of surveys per site, overlap of covariates between the occupancy and detection submodels, and spatiotemporal clustering of observations. Results showed that the model is robust to heterogeneous data and covariate overlap. However, when spatiotemporal gaps were added, site-level occupancy estimates were biased towards the average occupancy, which was itself overestimated. Random effects did not correct the influence of gaps, due to identifiability issues with the variance and autocorrelation parameters. Occupancy analysis of two butterfly species further confirmed these results. Overall, multi-season occupancy models with autocorrelation are robust to heterogeneous data and covariate overlap, but they still present identifiability issues and are challenged by severe data gaps, which compromise predictions even in data-rich areas.
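To make the data structure concrete, the following is a minimal sketch of how a detection dataset with a Poisson-distributed number of surveys per site and season could be simulated. It is an illustration under stated assumptions, not the simulation design of the study: the function names (`poisson`, `simulate`, `visit_summary`), all parameter values, and the simplification that occupancy is drawn independently for each site and season (no covariates, no colonisation or extinction dynamics, no spatial or temporal random effects) are hypothetical choices made here.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(n_sites=200, n_seasons=5, psi=0.4, p_detect=0.3,
             mean_visits=1.0, seed=1):
    """Simulate detection histories with a Poisson number of visits.

    Assumed simplification: occupancy is drawn independently for every
    site and season with constant probability psi, and each visit to an
    occupied site detects the species with constant probability p_detect.

    Returns a dict mapping (site, season) to a list of 0/1 detections.
    An empty list means the site was not surveyed in that season.
    """
    rng = random.Random(seed)
    histories = {}
    for site in range(n_sites):
        for season in range(n_seasons):
            occupied = rng.random() < psi
            n_visits = poisson(rng, mean_visits)
            histories[(site, season)] = [
                int(occupied and rng.random() < p_detect)
                for _ in range(n_visits)
            ]
    return histories


def visit_summary(histories):
    """Share of site-seasons with zero, exactly one, or repeated visits."""
    counts = [len(h) for h in histories.values()]
    total = len(counts)
    return {
        "unsurveyed": sum(c == 0 for c in counts) / total,
        "single_visit": sum(c == 1 for c in counts) / total,
        "replicated": sum(c >= 2 for c in counts) / total,
    }


if __name__ == "__main__":
    print(visit_summary(simulate()))
```

With a mean of one visit per site and season, a large share of site-seasons end up unsurveyed or visited only once, which is the kind of sparse replication the abstract describes. Lowering `mean_visits` makes the dataset sparser still.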