Does $K$-fold CV based penalty perform variable selection or does it lead to $n^{1/2}$-consistency in Lasso?

The least absolute shrinkage and selection operator, or Lasso, introduced by Tibshirani (1996), is one of the most widely used regularization methods in regression. The properties of Lasso are known to vary widely with the choice of the penalty parameter. The recent results of Lahiri (2021) suggest that, depending on the nature of the penalty parameter, Lasso can be either variable selection consistent or $n^{1/2}$-consistent. In practice, however, Lasso is generally implemented by choosing the penalty parameter in a data-dependent way, most commonly by $K$-fold cross-validation. In this paper, we explore the variable selection consistency and $n^{1/2}$-consistency of Lasso when the penalty is chosen by $K$-fold cross-validation with $K$ fixed. We consider the fixed-dimensional heteroscedastic linear regression model and show that Lasso with a $K$-fold cross-validation based penalty is $n^{1/2}$-consistent, but not variable selection consistent. As an intermediate result, we also establish the $n^{1/2}$-consistency of the $K$-fold cross-validation based penalty itself. Additionally, as a consequence of $n^{1/2}$-consistency, we establish the validity of the Bootstrap for approximating the distribution of the Lasso estimator based on $K$-fold cross-validation. We validate the Bootstrap approximation in finite samples through a moderate simulation study. Thus, our results essentially justify the use of $K$-fold cross-validation in practice to draw inferences based on $n^{1/2}$-scaled pivotal quantities in Lasso regression.
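To make the object of study concrete, the following is a minimal sketch of choosing the Lasso penalty by $K$-fold cross-validation with $K$ fixed, in a fixed-dimensional linear model with heteroscedastic errors. It is an illustration under stated assumptions, not the procedure or the simulation design of the paper: the objective $\sum_i (y_i - x_i^\top \beta)^2 + \lambda \sum_j |\beta_j|$, the coordinate-descent solver, the error scale, the penalty grid in multiples of $n^{1/2}$, and all function names (`soft_threshold`, `lasso_cd`, `kfold_cv_penalty`) are hypothetical choices made here.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=500, tol=1e-10):
    """Coordinate descent for sum_i (y_i - x_i'b)^2 + lam * sum_j |b_j|.

    Assumed objective (no intercept, no 1/n scaling); X is a list of rows.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Inner product of column j with the partial residual (coordinate j excluded).
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam / 2.0) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
            max_delta = max(max_delta, abs(delta))
        if max_delta < tol:
            break
    return beta


def kfold_cv_penalty(X, y, lambdas, K=5, seed=0):
    """Return the penalty in `lambdas` with the smallest K-fold CV prediction error."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::K] for k in range(K)]  # K disjoint folds, K held fixed
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for test in folds:
            held = set(test)
            train = [i for i in idx if i not in held]
            beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
            # Squared prediction error on the held-out fold.
            err += sum(
                (y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in test
            )
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam


# Illustrative data: two active and two inactive coefficients (assumed values).
rng = random.Random(1)
n, beta_true = 100, [2.0, -1.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in beta_true] for _ in range(n)]
# Heteroscedastic errors: the noise scale depends on the first covariate.
y = [
    sum(b * x for b, x in zip(beta_true, row)) + (0.5 + abs(row[0])) * rng.gauss(0, 1)
    for row in X
]
# Penalty grid in multiples of n^{1/2}; the multipliers are arbitrary.
lambdas = [c * n ** 0.5 for c in (0.1, 0.5, 1.0, 2.0, 5.0, 10.0)]
lam_hat = kfold_cv_penalty(X, y, lambdas, K=5)
beta_hat = lasso_cd(X, y, lam_hat)
print("CV-selected penalty:", lam_hat)
print("Lasso estimate:", beta_hat)
```

Inspecting `beta_hat` over repeated draws shows informally whether the coefficients whose true value is zero are estimated as exactly zero. This is the variable selection question the paper addresses theoretically.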
