Most tensor regression models currently used for high-dimensional data are based on Tucker decomposition, which has good properties but rapidly loses its efficiency in compressing tensors as the tensor order increases, say beyond four or five. Yet even the simplest tensor autoregression for time series data already has a coefficient tensor of order six. This paper revises a recently proposed tensor train (TT) decomposition and applies it to tensor regression so that a clear statistical interpretation can be obtained. The new tensor regression matches data with hierarchical structures well, and it can even lead to a better interpretation for data with factorial structures, which are supposed to be better fitted by models with Tucker decomposition. More importantly, the new tensor regression can be easily applied to higher-order tensors, since TT decomposition compresses the coefficient tensors much more efficiently. The methodology is also extended to tensor autoregression for time series data, and nonasymptotic properties are derived for the ordinary least squares estimators of both tensor regression and autoregression. A new algorithm is introduced to search for the estimators, and its theoretical justification is also discussed. Theoretical and computational properties of the proposed methodology are verified by simulation studies, and its advantages over existing methods are illustrated by two real examples.
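The compression claim can be made concrete by counting stored entries in the standard (unrevised) forms of the two decompositions. The sketch below is illustrative only and is not the paper's method: the function names, the common mode size of 10, and the common rank of 3 are assumptions chosen for the example, and the counts are raw numbers of stored entries that ignore identifiability constraints.

```python
from math import prod


def tucker_param_count(dims, ranks):
    """Stored entries in a standard Tucker format: one core tensor of size
    r_1 x ... x r_d plus one p_i x r_i factor matrix per mode."""
    if len(dims) != len(ranks):
        raise ValueError("Tucker needs one rank per mode")
    return prod(ranks) + sum(p * r for p, r in zip(dims, ranks))


def tt_param_count(dims, ranks):
    """Stored entries in a standard tensor train format: d cores of size
    r_{i-1} x p_i x r_i, with boundary ranks r_0 = r_d = 1."""
    if len(ranks) != len(dims) - 1:
        raise ValueError("TT needs d - 1 internal ranks")
    full = [1, *ranks, 1]  # pad with the boundary ranks
    return sum(full[i] * p * full[i + 1] for i, p in enumerate(dims))


if __name__ == "__main__":
    p, r = 10, 3  # hypothetical mode size and rank, same for every mode
    for d in (3, 6, 10):
        dims = [p] * d
        tucker = tucker_param_count(dims, [r] * d)
        tt = tt_param_count(dims, [r] * (d - 1))
        print(f"order {d}: Tucker {tucker}, TT {tt}")
```

Under these assumed sizes the Tucker core grows as 3^d while the TT count grows linearly in d, so Tucker is slightly smaller at order three but far larger by order six, the order of the coefficient tensor in the simplest tensor autoregression.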