We review Quasi Maximum Likelihood estimation of factor models for high-dimensional panels of time series. We consider two cases: (1) estimation when no dynamic model for the factors is specified (Bai and Li, 2012, 2016); (2) estimation based on the Kalman smoother and the Expectation Maximization algorithm, which allows the factor dynamics to be modeled explicitly (Doz et al., 2012; Barigozzi and Luciani, 2019). Our interest is in approximate factor models, i.e., models in which the idiosyncratic components are allowed to be mildly cross-sectionally, as well as serially, correlated. Although such a setting apparently makes estimation harder, we show that factor models do not in fact suffer from the curse of dimensionality; instead, they enjoy a blessing of dimensionality. In particular, given an approximate factor structure, if the cross-sectional dimension of the data, $N$, grows to infinity, we show that: (i) identification of the model is still possible; (ii) the mis-specification error due to the use of an exact factor model log-likelihood vanishes. Moreover, if we also let the sample size, $T$, grow to infinity, we can consistently estimate all parameters of the model and conduct inference. The same is true for estimation of the latent factors, which can be carried out by weighted least squares, linear projection, or Kalman filtering/smoothing. We also compare the approaches presented with Principal Component analysis and the classical, fixed-$N$, exact Maximum Likelihood approach. We conclude with a discussion of the efficiency of the estimators considered.