We consider the framework of penalized estimation where the penalty term is given by a real-valued polyhedral gauge, which encompasses methods such as LASSO (and many variants thereof, such as the generalized LASSO), SLOPE, OSCAR, PACS and others. Each of these estimators can uncover a different structure or ``pattern'' of the unknown parameter vector. We define a general notion of patterns based on subdifferentials and formalize an approach to measure their complexity. For pattern recovery, we provide a minimal condition for a particular pattern to be detected by the procedure with positive probability, the so-called accessibility condition. Using our approach, we also introduce the stronger noiseless recovery condition. For the LASSO, it is well known that the irrepresentability condition is necessary for pattern recovery with probability larger than $1/2$, and we show that the noiseless recovery condition plays exactly the same role, thereby extending and unifying the irrepresentability condition of the LASSO to a broad class of penalized estimators. We show that the noiseless recovery condition can be relaxed when turning to thresholded penalized estimators, extending the idea of the thresholded LASSO: we prove that the accessibility condition is already sufficient (and necessary) for sure pattern recovery by thresholded penalized estimation, provided that the signal of the pattern is large enough. Throughout the article, we demonstrate how our findings can be interpreted through a geometrical lens.