Lasso-type estimators are routinely used to estimate high-dimensional time series models. The theoretical guarantees established for the Lasso typically require the penalty level to be chosen in a suitable fashion, often depending on unknown population quantities. Furthermore, the resulting estimates and the number of variables retained in the model depend crucially on the chosen penalty level. However, there is currently no theoretically founded guidance for this choice in the context of high-dimensional time series. Instead, one resorts to selecting the penalty level in an ad hoc manner using, e.g., information criteria or cross-validation. We resolve this problem for perhaps the most commonly employed multivariate time series model, the linear vector autoregressive (VAR) model, by proposing a weighted Lasso estimator whose penalization is chosen in a fully data-driven way. The theoretical guarantees that we establish for the resulting estimation and prediction error match those currently available for methods based on infeasible choices of penalization. We thus provide a first solution for choosing the penalization in high-dimensional time series models.
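For orientation, the setting described above can be written in standard notation. The following is only a generic sketch of a VAR($p$) model and an equation-wise weighted Lasso objective. All symbols ($A_k$, $z_t$, $\lambda$, $w_{i,j}$, the dimension $d$, the lag order $p$) are assumptions introduced here for illustration; the abstract does not state the paper's actual data-driven weights, penalty level, or normalization.

```latex
\begin{align*}
  y_t &= \sum_{k=1}^{p} A_k\, y_{t-k} + \varepsilon_t,
      \qquad t = 1,\dots,T, \quad y_t \in \mathbb{R}^{d}, \\
  z_t &= \bigl(y_{t-1}^\top, \dots, y_{t-p}^\top\bigr)^\top \in \mathbb{R}^{dp}, \\
  \hat{\beta}_i &\in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{dp}}
      \; \frac{1}{T} \sum_{t=1}^{T} \bigl(y_{t,i} - z_t^\top \beta\bigr)^2
      + 2\lambda \sum_{j=1}^{dp} w_{i,j}\, \lvert \beta_j \rvert,
      \qquad i = 1,\dots,d.
\end{align*}
```

In this sketch, $\hat{\beta}_i$ estimates the $i$-th rows of $A_1,\dots,A_p$ stacked into one vector, and setting $w_{i,j} = 1$ for all $j$ recovers the plain Lasso. A data-driven procedure in the sense of the abstract would determine $\lambda$ and the weights $w_{i,j}$ from the observed data alone, rather than from unknown population quantities.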