We analyze the performance of the least absolute shrinkage and selection operator (Lasso) for the linear model when the number of regressors $N$ grows while the true support size $d$ remains finite, i.e., the ultra-sparse case. The result is based on a novel treatment of the non-rigorous replica method from statistical physics, which has previously been applied only to problem settings where $N$, $d$, and the number of observations $M$ tend to infinity at the same rate. Our analysis makes it possible to assess the average performance of Lasso with Gaussian sensing matrices without assumptions on the scaling of $N$ and $M$, the noise distribution, or the profile of the true signal. Under mild conditions on the noise distribution, the analysis also offers a lower bound on the sample complexity necessary for partial and perfect support recovery when $M$ diverges as $M = O(\log N)$. The obtained bound for perfect support recovery generalizes the bound given in previous literature, which considers only the case of Gaussian noise and diverging $d$. Extensive numerical experiments strongly support our analysis.
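To make the setting concrete, the following is a minimal numerical sketch of Lasso support recovery with a Gaussian sensing matrix in the ultra-sparse regime (finite $d$, large $N$). It is not the replica analysis itself, nor the authors' experimental code. The objective normalization $\frac{1}{2M}\|y - Ax\|_2^2 + \lambda\|x\|_1$, the cyclic coordinate-descent solver, the unit-amplitude signal, the Gaussian noise, and all parameter values are assumptions chosen for illustration; the function names `soft_threshold`, `lasso_cd`, and `simulate` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(A, y, lam, n_sweeps=100):
    """Minimize (1/(2M)) * ||y - A x||^2 + lam * ||x||_1
    by cyclic coordinate descent (an illustrative solver choice)."""
    M, N = A.shape
    x = np.zeros(N)
    r = y.copy()                       # residual y - A x, kept up to date
    col_sq = (A ** 2).sum(axis=0) / M  # per-column curvature ||A_j||^2 / M
    for _ in range(n_sweeps):
        for j in range(N):
            # Correlation of column j with the residual that excludes x[j].
            rho = A[:, j] @ r / M + col_sq[j] * x[j]
            x_new = soft_threshold(rho, lam) / col_sq[j]
            if x_new != x[j]:
                r -= A[:, j] * (x_new - x[j])
                x[j] = x_new
    return x


def simulate(N, M, d, sigma, lam, seed=0):
    """Draw one ultra-sparse instance and return true/estimated supports."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((M, N))    # Gaussian sensing matrix
    x0 = np.zeros(N)
    support = rng.choice(N, size=d, replace=False)
    x0[support] = 1.0                  # assumed signal profile: unit amplitudes
    y = A @ x0 + sigma * rng.standard_normal(M)  # assumed Gaussian noise
    x_hat = lasso_cd(A, y, lam)
    est = np.flatnonzero(np.abs(x_hat) > 1e-8)
    return set(support.tolist()), set(est.tolist()), x0, x_hat


if __name__ == "__main__":
    true_s, est_s, x0, x_hat = simulate(N=200, M=200, d=3, sigma=0.1, lam=0.1)
    print("true support:     ", sorted(true_s))
    print("estimated support:", sorted(est_s))
    print("perfect recovery: ", true_s == est_s)
```

Repeating `simulate` over many seeds while shrinking `M` relative to `log N` would give an empirical estimate of the partial and perfect support recovery rates that the abstract refers to. The value `M=200` used here is deliberately generous and does not probe the $M = O(\log N)$ boundary.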