Unveiling low-dimensional patterns induced by convex non-differentiable regularizers

Popular regularizers with non-differentiable penalties, such as Lasso, Elastic Net, Generalized Lasso, or SLOPE, reduce the dimension of the parameter space by inducing sparsity or clustering in the estimators' coordinates. In this paper, we focus on linear regression and explore the asymptotic distributions of the resulting low-dimensional patterns when the number of regressors $p$ is fixed, the number of observations $n$ goes to infinity, and the penalty function increases at the rate of $\sqrt{n}$. While the asymptotic distribution of the rescaled estimation error can be derived by relatively standard arguments, the convergence of the pattern does not simply follow from convergence in distribution and requires a careful, separate treatment. For this purpose, we use the Hausdorff distance as a suitable mode of convergence for subdifferentials, which yields the desired pattern convergence. Furthermore, we derive the exact limiting probability of recovering the true model pattern. This probability goes to 1 if and only if the penalty scaling constant diverges to infinity and the regularizer-specific asymptotic irrepresentability condition is satisfied. We then propose simple two-step procedures that asymptotically recover the model patterns, irrespective of whether the irrepresentability condition holds. Interestingly, our theory shows that Fused Lasso cannot reliably recover its own clustering pattern, even for independent regressors. It also demonstrates how this problem can be resolved by ``concavifying'' the Fused Lasso penalty coefficients. Additionally, sampling from the asymptotic error distribution facilitates comparisons between different regularizers. We provide short simulation studies that illustrate and compare the asymptotic properties of Lasso, Fused Lasso, and SLOPE.
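To make the notion of pattern recovery concrete, the following is a minimal sketch for the Lasso case only, and not the paper's procedure. It assumes that the Lasso pattern is the sign vector of the coefficients, that the objective is $\frac{1}{2}\|y - Xb\|^2 + c\sqrt{n}\,\|b\|_1$, and that the regressors and noise are independent standard Gaussians. The function names (`lasso_cd`, `sign_pattern`, `recovery_frequency`) are hypothetical, and the Monte Carlo frequency is a finite-$n$ stand-in for the exact limiting probability derived in the paper.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    # Precompute the Gram matrix X'X and the vector X'y.
    gram = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
            for j in range(p)]
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of column j with the partial residual.
            r = xty[j] - sum(gram[j][k] * b[k] for k in range(p) if k != j)
            b[j] = soft_threshold(r, lam) / gram[j][j]
    return b


def sign_pattern(b, tol=1e-10):
    """Assumed Lasso pattern: the vector of coefficient signs (-1, 0, 1)."""
    return tuple(0 if abs(v) <= tol else (1 if v > 0 else -1) for v in b)


def recovery_frequency(beta, n, c, n_rep=100, seed=0):
    """Monte Carlo frequency with which Lasso recovers the sign pattern of beta.

    The penalty is lam = c * sqrt(n), matching the sqrt(n) scaling regime.
    Design and noise are i.i.d. N(0, 1); this is an illustrative assumption.
    """
    rng = random.Random(seed)
    p = len(beta)
    target = sign_pattern(beta)
    hits = 0
    for _ in range(n_rep):
        X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
        y = [sum(X[i][j] * beta[j] for j in range(p)) + rng.gauss(0.0, 1.0)
             for i in range(n)]
        b_hat = lasso_cd(X, y, lam=c * math.sqrt(n))
        hits += sign_pattern(b_hat) == target
    return hits / n_rep


if __name__ == "__main__":
    # Compare a few values of the penalty scaling constant c.
    for c in (0.5, 2.0, 8.0):
        print(c, recovery_frequency([1.0, 0.0, -1.0], n=200, c=c))
```

Running the script prints the empirical recovery frequency for several values of the scaling constant $c$. Extending the sketch to the clustering patterns of Fused Lasso or SLOPE would require a different penalty and a different pattern definition, which are not shown here.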
