This paper investigates the effect of the design matrix on the ability (or inability) to estimate a sparse parameter in linear regression. More specifically, we characterize the optimal rate of estimation when the smallest singular value of the design matrix is bounded away from zero. In addition to this information-theoretic result, we provide and analyze a procedure which is simultaneously statistically optimal and computationally efficient, based on soft thresholding the ordinary least squares estimator. Most surprisingly, we show that the Lasso estimator -- despite its widespread adoption for sparse linear regression -- is provably minimax rate-suboptimal when the minimum singular value is small. We present a family of design matrices and sparse parameters for which we can guarantee that the Lasso with any choice of regularization parameter -- including those which are data-dependent and randomized -- would fail in the sense that its estimation rate is suboptimal by polynomial factors in the sample size. Our lower bound is strong enough to preclude the statistical optimality of all forms of the Lasso, including its highly popular penalized, norm-constrained, and cross-validated variants.
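To make the procedure named in the abstract concrete, the following is a minimal sketch of soft thresholding the ordinary least squares estimator, assuming a design matrix with full column rank. The function names and the threshold rule in `conservative_threshold` are illustrative assumptions, not the tuning analyzed in the paper: the rule only uses the fact that each OLS coordinate has variance at most sigma^2 / (n * mu), where mu is the smallest eigenvalue of X^T X / n.

```python
import numpy as np


def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau; entries with |v_j| <= tau become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def soft_thresholded_ols(X, y, tau):
    """Compute the OLS estimate, then soft-threshold it coordinate-wise at level tau."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return soft_threshold(beta_ols, tau)


def conservative_threshold(X, sigma):
    """Illustrative threshold (an assumption, not the paper's choice).

    Uses mu = lambda_min(X^T X / n) to bound the per-coordinate standard
    deviation of OLS by sigma / sqrt(n * mu), then applies a sqrt(2 log d) factor.
    """
    n, d = X.shape
    mu = np.linalg.eigvalsh(X.T @ X / n)[0]  # eigenvalues are returned in ascending order
    if mu <= 0:
        raise ValueError("X^T X must be positive definite for OLS to be well defined")
    return sigma * np.sqrt(2.0 * np.log(d) / (n * mu))
```

With noise level `sigma` treated as known, a call might look like `soft_thresholded_ols(X, y, conservative_threshold(X, sigma))`. The cost is one least squares solve plus a coordinate-wise operation, which is the sense in which such a procedure is computationally efficient.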