Cross-validation is a common way to evaluate the predictive performance of a statistical model. The leave-one-out method is frequently employed, but it is justified primarily for independent and identically distributed observations; with dependent observations it tends to mimic interpolation rather than prediction. This paper proposes a modified cross-validation method for dependent observations, which excludes an automatically determined set of observations from the training set to mimic a more reasonable prediction scenario. Within the framework of latent Gaussian models, we also illustrate a method to adjust the joint posterior for this modified cross-validation so that model refitting is avoided. The new approach is available in the R-INLA package (www.r-inla.org).
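The contrast between leaving out one observation and leaving out a group of dependent observations can be illustrated with a small Python sketch. It is not the R-INLA implementation and makes several simplifying assumptions: the data follow a zero-mean Gaussian model with a known AR(1) covariance, the left-out group for observation i is chosen by thresholding the model correlation at a hypothetical value of 0.5, and predictive densities are computed by exact Gaussian conditioning rather than by adjusting a fitted joint posterior. All function names are hypothetical.

```python
import numpy as np


def ar1_covariance(n, rho, sigma2=1.0):
    """Covariance of a stationary AR(1) process: sigma2 * rho^|i - j|."""
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])


def correlation_groups(cov, threshold):
    """For each i, the indices whose absolute correlation with i is >= threshold.

    Assumption: a plain threshold stands in for an automatic group construction.
    """
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)
    return [np.flatnonzero(np.abs(corr[i]) >= threshold) for i in range(cov.shape[0])]


def conditional_moments(y, cov, i, left_out):
    """Mean and variance of y[i] given all observations not in left_out."""
    train = np.setdiff1d(np.arange(len(y)), left_out)
    if train.size == 0:
        return 0.0, cov[i, i]  # nothing to condition on: fall back to the prior
    k = cov[i, train]
    w = np.linalg.solve(cov[np.ix_(train, train)], k)
    return w @ y[train], cov[i, i] - w @ k


def cv_log_scores(y, cov, groups):
    """Log predictive density of each y[i] with its group removed from training."""
    scores = []
    for i, group in enumerate(groups):
        mean, var = conditional_moments(y, cov, i, group)
        scores.append(-0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var))
    return np.array(scores)


rng = np.random.default_rng(0)
n, rho = 50, 0.9
cov = ar1_covariance(n, rho)
y = rng.multivariate_normal(np.zeros(n), cov)

loo_groups = [np.array([i]) for i in range(n)]        # leave-one-out
lgo_groups = correlation_groups(cov, threshold=0.5)   # leave-group-out

loo_scores = cv_log_scores(y, cov, loo_groups)
lgo_scores = cv_log_scores(y, cov, lgo_groups)
print("mean LOO log score:", loo_scores.mean())
print("mean LGO log score:", lgo_scores.mean())
```

In this sketch, leave-one-out predicts each point from its immediate neighbours, which is essentially interpolation, whereas removing the correlated group forces a prediction across a gap and gives a larger predictive variance.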