Motivated by the need to analyse large spatio-temporal panel data, we introduce a novel dimensionality reduction methodology for $n$-dimensional random fields observed across $S$ spatial locations and $T$ time periods. We call it the General Spatio-Temporal Factor Model (GSTFM). First, we provide the probabilistic and mathematical underpinning needed to represent a random field as the sum of two components: the common component (driven by a small number $q$ of latent factors) and the idiosyncratic component (mildly cross-correlated). We show that the two components are identified as $n\to\infty$. Second, we propose an estimator of the common component and derive its statistical guarantees (consistency and rate of convergence) as $\min(n, S, T)\to\infty$. Third, we propose an information criterion to determine the number of factors. Estimation relies on Fourier analysis in the frequency domain and thus fully exploits the information on the spatio-temporal covariance structure of the whole panel. Synthetic data examples illustrate the applicability of GSTFM and its advantages over the extant generalized dynamic factor model, which ignores spatial correlations.