Identifying spatially heterogeneous patterns has attracted a surge of research interest in recent years because of its important applications in many scientific and engineering fields. In practice, spatially heterogeneous components are often mixed with spatially smooth components, which makes identifying the heterogeneous regions more challenging. In this paper, we develop an efficient clustering approach to identify model heterogeneity in the spatial additive partial linear model. Specifically, we aim to detect spatially contiguous clusters based on the regression coefficients, while introducing a spatially varying intercept to account for the smooth spatial effect. To approximate the spatially varying intercept, we use bivariate splines over triangulation, which can effectively handle data from a complex domain. To reveal the spatial clustering pattern, we propose a novel fusion penalty termed the forest lasso. The proposed fusion penalty has advantages in both estimation efficiency and computational efficiency when dealing with large spatial data. Theoretical properties of our estimator are established, and simulation results show that our approach achieves more accurate estimation at limited computational cost than existing approaches. To illustrate its practical use, we apply our approach to analyze the spatial pattern of the relationship between land surface temperature measured by satellites and air temperature measured by ground stations in the United States.
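The abstract does not define the forest lasso, so the following is only a minimal sketch of the general idea behind a tree-based fusion penalty, under stated assumptions: a single minimum spanning tree is built over the sampling locations, the penalty sums the L1 differences of coefficient vectors along tree edges, and locations joined by edges with (numerically) equal coefficients are read off as one cluster. The function names `minimum_spanning_tree`, `fusion_penalty`, and `clusters_from_fused_edges` are hypothetical, and the way the paper actually constructs its forest and penalty may differ.

```python
import math


def minimum_spanning_tree(coords):
    """Prim's algorithm on the complete Euclidean graph over `coords`.

    Returns a list of (i, j) index pairs, one per tree edge.
    Assumption: a single spanning tree stands in for the paper's "forest".
    """
    n = len(coords)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_from = [-1] * n         # tree node providing that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax distances of the remaining nodes through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(coords[u], coords[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def fusion_penalty(beta, edges, lam):
    """lam * sum over tree edges of the L1 distance between coefficient vectors."""
    return lam * sum(
        sum(abs(a - b) for a, b in zip(beta[i], beta[j]))
        for i, j in edges
    )


def clusters_from_fused_edges(beta, edges, tol=1e-8):
    """Merge locations whose coefficients agree along a tree edge (union-find).

    Returns one cluster label per location; equal labels mean same cluster.
    """
    parent = list(range(len(beta)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        if all(abs(a - b) <= tol for a, b in zip(beta[i], beta[j])):
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(beta))]
```

Because the penalty only couples locations that are neighbors in the tree, the number of penalty terms grows linearly with the number of locations, and any clusters recovered this way are connected along tree edges. This is the intuition behind fusing over a sparse graph rather than over all pairs; it is not a claim about the paper's exact construction.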