This paper proposes a novel data-driven approach for identifying and modelling areas with similar temperature variations through clustering and Space-Time AutoRegressive (STAR) models. Using annual temperature data from 168 countries (1901-2022), we apply three clustering methods based on (i) warming rates, (ii) annual temperature variations, and (iii) persistence of variation signs, using Euclidean and Hamming distances. These clusters are then employed to construct alternative spatial weight matrices for STAR models. Empirical results show that distance-based STAR models outperform classical contiguity-based ones, both in-sample and out-of-sample, with the Hamming distance-based STAR model achieving the best predictive accuracy. The study demonstrates that using statistical similarity rather than geographical proximity improves the modelling of global temperature dynamics, suggesting broader applicability to other environmental and socioeconomic datasets.
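
To make the idea of a similarity-based weight matrix concrete, the following minimal Python sketch computes Hamming distances between the sign sequences of year-on-year temperature variations and converts them into a row-normalised weight matrix. It is an illustration under stated assumptions, not the construction used in the study: the function names `sign_hamming_distance` and `similarity_weights`, the thresholding rule, and the row normalisation are all hypothetical choices, and the input array is toy data.

```python
import numpy as np


def sign_hamming_distance(temps):
    """Normalised Hamming distance between sign sequences of annual variations.

    temps: array of shape (n_regions, n_years) with annual temperatures.
    Returns an (n_regions, n_regions) matrix whose entry (i, j) is the
    fraction of years in which regions i and j vary in opposite directions.
    """
    # Sign of the year-on-year change: +1 warming, -1 cooling, 0 no change.
    signs = np.sign(np.diff(temps, axis=1))
    # Compare every pair of regions year by year and average the mismatches.
    return (signs[:, None, :] != signs[None, :, :]).mean(axis=2)


def similarity_weights(dist, threshold):
    """Binary, row-normalised weights (assumed rule: neighbours if dist <= threshold)."""
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)  # a region is not its own neighbour
    row_sums = w.sum(axis=1, keepdims=True)
    # Rows with no neighbours are left as zeros instead of dividing by zero.
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)


# Toy data (not from the paper): three regions, four years.
temps = np.array([
    [10.0, 10.5, 10.2, 10.8],
    [20.0, 20.3, 20.1, 20.6],
    [5.0, 4.8, 5.1, 4.9],
])
dist = sign_hamming_distance(temps)
weights = similarity_weights(dist, threshold=0.5)
```

In this toy example the first two regions differ widely in level but share the same pattern of warming and cooling years, so they are treated as neighbours, whereas a contiguity-based matrix would link regions only by shared borders.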