We propose an imputation method for tensor time series under general missing patterns, requiring only that any two data positions along a tensor fibre are both observed for enough time points. The method is based on a tensor time series factor model with a Tucker decomposition of the common component. One distinguishing feature of this factor model is that it allows weak factors in the factor loading matrix of each mode. This better reflects real data, in which weak factors may drive only groups of observed variables, for instance a sector factor in a financial market driving only the stocks in that sector. Using the data with missing entries, we derive asymptotic normality for the rows of the estimated factor loadings, and consistent covariance matrix estimation enables us to carry out inference. As a first in the literature, we also propose a ratio-based estimator for the rank of the core tensor under general missing patterns. Rates of convergence are spelt out for the imputations from the estimated tensor factor models. We introduce a new measure for gauging imputation performance, and simulation results show that our imputation procedure works well, with asymptotic normality and the corresponding inference also demonstrated. We further gauge re-imputation performance and demonstrate that using a rank slightly larger than the estimated one gives superior re-imputation performance. An NYC taxi traffic data set is also analysed by imposing general missing patterns and gauging the imputation performance.
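To make the pipeline described above concrete, the following is a minimal sketch for the simplest case of an order-2 tensor (a matrix-valued time series). It is not the estimator proposed in the paper: it assumes strong factors only, uses a plain eigenvalue-ratio rule on pairwise-observed second-moment matrices as a stand-in for the ratio-based rank estimator, and recovers the core by least squares on the observed entries. All function names (`simulate`, `pairwise_moment`, `ratio_rank`, `impute`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate(T=400, d1=8, d2=6, r1=2, r2=2, noise=0.3, p_obs=0.8, seed=0):
    """Toy series X_t = A1 F_t A2' + E_t with entries missing at random."""
    rng = np.random.default_rng(seed)
    # Assumption of this toy: orthogonal loading columns scaled to norm sqrt(d),
    # i.e. strong factors only (the weak-factor case is not covered here).
    A1 = np.sqrt(d1) * np.linalg.qr(rng.normal(size=(d1, r1)))[0]
    A2 = np.sqrt(d2) * np.linalg.qr(rng.normal(size=(d2, r2)))[0]
    F = rng.normal(size=(T, r1, r2))                 # core (factor) series
    common = np.einsum("ia,tab,jb->tij", A1, F, A2)  # Tucker-form common component
    X = common + noise * rng.normal(size=common.shape)
    mask = rng.random(size=X.shape) < p_obs          # True = observed
    return X, common, mask

def pairwise_moment(X, mask, mode):
    """Second-moment matrix for one mode (0 = rows, 1 = columns of X_t).
    Entry (i, j) averages products over the (time, fibre) positions
    where both i and j are observed."""
    Xz = np.where(mask, X, 0.0)
    M = mask.astype(float)
    if mode == 1:  # move the mode of interest to axis 1
        Xz, M = Xz.transpose(0, 2, 1), M.transpose(0, 2, 1)
    num = np.einsum("tih,tjh->ij", Xz, Xz)  # sums of jointly observed products
    cnt = np.einsum("tih,tjh->ij", M, M)    # number of jointly observed positions
    return num / np.maximum(cnt, 1.0)

def ratio_rank(S, kmax=None):
    """Eigenvalue-ratio rule: pick the j minimising s[j+1] / s[j].
    Singular values are used because a pairwise-observed moment matrix
    need not be positive semi-definite."""
    s = np.linalg.svd(S, compute_uv=False)  # sorted in descending order
    kmax = kmax or len(s) // 2
    return int(np.argmin(s[1:kmax + 1] / s[:kmax])) + 1

def impute(X, mask):
    """Estimate ranks and loading spaces per mode, then fit the core of each
    X_t by least squares on its observed entries."""
    S = [pairwise_moment(X, mask, m) for m in (0, 1)]
    r = [ratio_rank(Sm) for Sm in S]
    Q = [np.linalg.svd(Sm)[0][:, :rm] for Sm, rm in zip(S, r)]
    K = np.kron(Q[0], Q[1])  # row-major vec(X_t) is approximately K @ vec(F_t)
    T, d1, d2 = X.shape
    common_hat = np.empty_like(X)
    for t in range(T):
        obs = mask[t].ravel()
        f, *_ = np.linalg.lstsq(K[obs], X[t].ravel()[obs], rcond=None)
        common_hat[t] = (K @ f).reshape(d1, d2)
    return r[0], r[1], common_hat

X, common, mask = simulate()
r1_hat, r2_hat, common_hat = impute(X, mask)
X_imputed = np.where(mask, X, common_hat)  # keep observed values, fill the rest
miss = ~mask
rel_err = np.linalg.norm((common_hat - common)[miss]) / np.linalg.norm(common[miss])
print("estimated ranks:", (r1_hat, r2_hat), "relative error on missing entries:", rel_err)
```

The pairwise averaging in `pairwise_moment` is where the observation condition matters: every pair of positions along a fibre must be jointly observed at enough time points for the corresponding entry to be estimated reliably. Passing ranks slightly larger than `r1_hat` and `r2_hat` into the loading step would be the natural way to experiment with the re-imputation finding mentioned above.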