Spherically embedded spatial data are spatially indexed observations whose values naturally reside on, or can be equivalently mapped to, the unit sphere. Such data are increasingly common in fields ranging from geochemistry to demography. However, analysing them presents unique difficulties because the sphere is intrinsically non-Euclidean, and rigorous methodologies for statistical modelling, inference, and uncertainty quantification remain limited. This paper introduces a unified framework that addresses these three limitations for spherically embedded spatial data. First, we propose a novel spherical spatial autoregressive model that leverages optimal transport geometry, and then extend it to accommodate exogenous covariates. Second, both with and without covariates, we establish the asymptotic properties of the estimators and derive a distribution-free Wald test for spatial dependence, complemented by a bootstrap procedure to enhance finite-sample performance. Third, we contribute a novel approach to uncertainty quantification by developing a conformal prediction procedure specifically tailored to spherically embedded spatial data. The practical utility of these methodological advances is illustrated through extensive simulations and applications to Spanish geochemical compositions and Japanese age-at-death mortality distributions.
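As a rough illustration of what "mapped to the unit sphere" can mean for compositional data, the sketch below uses the square-root transform, a standard device that sends a vector of non-negative parts summing to one onto the positive orthant of the unit sphere. The abstract does not state which mapping the paper uses, so the transform, the function names `composition_to_sphere` and `geodesic_distance`, and the example values are all assumptions made here for illustration only.

```python
import numpy as np


def composition_to_sphere(parts):
    """Map a composition to the unit sphere via the square-root transform.

    Assumption: this is one common embedding, not necessarily the paper's.
    """
    parts = np.asarray(parts, dtype=float)
    if np.any(parts < 0):
        raise ValueError("composition parts must be non-negative")
    total = parts.sum(axis=-1, keepdims=True)
    if np.any(total <= 0):
        raise ValueError("composition must have a positive total")
    # Close the composition so it sums to one, then take square roots;
    # the result has unit Euclidean norm by construction.
    return np.sqrt(parts / total)


def geodesic_distance(u, v):
    """Great-circle (arc-length) distance between points on the unit sphere."""
    inner = np.sum(np.asarray(u) * np.asarray(v), axis=-1)
    # Clip guards against rounding pushing the inner product outside [-1, 1].
    return np.arccos(np.clip(inner, -1.0, 1.0))


# Hypothetical three-part compositions (illustrative values only).
sample_a = composition_to_sphere([0.2, 0.3, 0.5])
sample_b = composition_to_sphere([0.1, 0.6, 0.3])
distance = geodesic_distance(sample_a, sample_b)
```

Distances computed this way respect the curvature of the sphere, which is why methods built on ordinary Euclidean differences do not carry over directly to such data.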