In this paper, we develop new methods for analyzing high-dimensional tensor datasets. A tensor factor model describes a high-dimensional dataset as the sum of a low-rank component and idiosyncratic noise, generalizing traditional factor models for panel data. We propose an estimation algorithm, tensor principal component analysis (TPCA), which generalizes the traditional PCA used for panel data. The algorithm unfolds the tensor into a sequence of matrices along its different dimensions and applies PCA to each unfolded matrix. We establish consistency and derive the asymptotic distribution of the TPCA estimators of the loadings and factors. We also introduce a novel test for the number of factors in a tensor factor model. Both TPCA and the test perform well in Monte Carlo experiments, and we apply them to sorted portfolios.
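The unfold-then-PCA step described above can be illustrated with a minimal NumPy sketch. This is not the paper's exact estimator. It assumes a three-way tensor whose low-rank component is a sum of scaled outer products of orthonormal loading vectors, and it uses the leading left singular vectors of each unfolding as the loading estimates. The function names (`unfold`, `tpca_loadings`, `simulate_three_way`), the scale values, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen dimension, columns all others."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def tpca_loadings(tensor, rank):
    """Apply PCA (via SVD) to each unfolding and keep the top `rank` directions.

    Assumption: loadings are identified only up to sign, and no centering or
    rescaling is applied; the paper's normalization may differ.
    """
    loadings = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        loadings.append(u[:, :rank])
    return loadings


def simulate_three_way(shape, scales, noise_sd, seed=0):
    """Simulate low-rank signal plus i.i.d. Gaussian noise (illustrative design)."""
    rng = np.random.default_rng(seed)
    rank = len(scales)
    # Orthonormal loading vectors for each of the three dimensions.
    true = [np.linalg.qr(rng.standard_normal((n, rank)))[0] for n in shape]
    signal = np.einsum("r,ir,jr,kr->ijk", np.asarray(scales), *true)
    noise = noise_sd * rng.standard_normal(shape)
    return signal + noise, true


if __name__ == "__main__":
    tensor, true = simulate_three_way((30, 20, 40), scales=(40.0, 20.0), noise_sd=0.1)
    estimated = tpca_loadings(tensor, rank=2)
    for mode, (est, tru) in enumerate(zip(estimated, true)):
        # Absolute inner products near 1 indicate recovery up to sign.
        print(mode, np.abs(np.sum(est * tru, axis=0)))
```

In this simulated design the factors are strong relative to the noise, so each estimated loading vector should align closely with its true counterpart up to sign. Weaker factors or nearly equal scales would make the individual directions harder to separate.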