High-dimensional, higher-order tensor data are gaining prominence in a variety of fields, including but not limited to computer vision and network analysis. Tensor factor models, induced from noisy versions of tensor decompositions or factorizations, are natural and potent instruments for studying a collection of tensor-variate objects that may be dependent or independent. However, statistical inferential theory for estimating the various low-rank structures, which customarily play the role of signals in tensor factor models, is still at an early stage of development. In this paper, we attempt to "decode" the estimation of a higher-order tensor factor model by leveraging tensor matricization. Specifically, we recast it into mode-wise traditional high-dimensional vector/fiber factor models, enabling the deployment of conventional principal component analysis (PCA) for estimation. Using the Tucker tensor factor model (TuTFaM), which is induced from the noisy version of the widely used Tucker decomposition, as a demonstration, we conclude that estimation of the signal components essentially amounts to mode-wise PCA, and that the involvement of projection and iteration enhances the signal-to-noise ratio to varying extents. We establish the inferential theory of the proposed estimators, conduct extensive simulation experiments, and illustrate how the proposed estimators can be applied to tensor reconstruction and clustering for independent video and dependent economic datasets, respectively.
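To make the idea of mode-wise PCA through matricization concrete, the following is a minimal NumPy sketch under stated assumptions; it is not the paper's estimator. It assumes observations of the form signal plus noise, where the signal is a low-dimensional core tensor multiplied along each mode by a loading matrix, and it estimates each loading space from the leading left singular vectors of the mode-wise unfolding. The function names (`unfold`, `mode_product`, `modewise_pca`, `reconstruct`), the orthonormal normalization of the loadings, and the synthetic data are all illustrative choices. The projection and iteration refinements mentioned in the abstract are not implemented here.

```python
import numpy as np

def unfold(x, mode):
    """Matricize x along `mode`: rows index that mode, columns all other axes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def mode_product(x, mat, mode):
    """Multiply tensor x along axis `mode` by mat (shape: new_dim x x.shape[mode])."""
    out = np.tensordot(mat, x, axes=(1, mode))  # contracted axis ends up in front
    return np.moveaxis(out, 0, mode)

def modewise_pca(X, ranks):
    """X has shape (T, p_1, ..., p_D); axis 0 indexes the T observations.
    Returns one loading estimate per mode, each with orthonormal columns."""
    loadings = []
    for d, r in enumerate(ranks):
        # Unfold along tensor mode d (array axis d + 1), pooling all observations.
        M = unfold(X, d + 1)
        # Leading left singular vectors of M = leading eigenvectors of M M^T.
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        loadings.append(U[:, :r])
    return loadings

def reconstruct(X, loadings):
    """Estimate the signal by projecting every mode onto its estimated loading space."""
    S = X
    for d, A in enumerate(loadings):
        S = mode_product(S, A @ A.T, d + 1)
    return S

# Synthetic example (all sizes and the noise level are arbitrary assumptions).
rng = np.random.default_rng(0)
T, dims, ranks = 50, (10, 8, 6), (2, 2, 2)
A_true = [np.linalg.qr(rng.standard_normal((p, r)))[0] for p, r in zip(dims, ranks)]
signal = rng.standard_normal((T, *ranks))      # core factor tensors
for d, A in enumerate(A_true):
    signal = mode_product(signal, A, d + 1)    # expand each mode to p_d
X = signal + 0.1 * rng.standard_normal(signal.shape)

A_hat = modewise_pca(X, ranks)
S_hat = reconstruct(X, A_hat)
rel_err = np.linalg.norm(S_hat - signal) / np.linalg.norm(signal)
print(f"relative reconstruction error: {rel_err:.3f}")
```

Pooling the unfoldings of all observations into one wide matrix per mode is what turns the tensor problem into an ordinary PCA problem for that mode; the column ordering of the unfolding does not affect the left singular vectors.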