Model selection in penalized regression depends critically on an accurate assessment of model complexity, commonly quantified through the effective degrees of freedom. For the Lasso, the size of the active set gives a simple, unbiased estimate of this quantity. That property does not extend to adaptive penalization methods, even though active set size is widely used as an approximation for them in practice. To address this gap, we derive an unbiased estimator of the effective degrees of freedom for the Adaptive Lasso within Stein's unbiased risk estimation framework. Our analysis reveals additional terms induced by data-dependent penalization, reflecting the role of the adaptive weights and the regularization parameter in determining model complexity. We further revisit the Group Lasso, providing an alternative derivation of its degrees of freedom, and extend these results to the Adaptive Group Lasso. Importantly, we characterize the behavior of the degrees of freedom along the regularization path under general design matrices, going beyond the orthonormal design setting commonly assumed in the literature. By correcting the common misuse of active set size as a proxy for degrees of freedom, our results enable more reliable risk estimation and inference and offer a rigorous foundation for understanding model complexity in adaptive penalized regression.
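The gap between active set size and degrees of freedom can be illustrated numerically in the simplest case. The sketch below is not the estimator derived in the paper. It assumes an orthonormal design (so each fit reduces to coordinate-wise thresholding of `y`), a loss scaled by 1/2, known noise level, and adaptive weights `1/|y_i|`; the function names, the mean vector `mu`, and all numeric settings are hypothetical choices made for illustration. Under these assumptions, it compares a Monte Carlo estimate of the covariance definition of degrees of freedom, sum_i Cov(fit_i, y_i) / sigma^2, with the average active set size and, for the adaptive fit, with the average divergence obtained by differentiating the thresholding rule directly.

```python
import numpy as np


def soft_threshold(y, lam):
    """Lasso fit under an orthonormal design: coordinate-wise soft-thresholding."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)


def adaptive_threshold(y, lam):
    """Adaptive Lasso fit under an orthonormal design with weights 1/|y_i| (assumed)."""
    safe = np.where(y == 0.0, 1.0, np.abs(y))  # y == 0 is thresholded to zero anyway
    return np.sign(y) * np.maximum(np.abs(y) - lam / safe, 0.0)


def covariance_df(fit, y, mu, sigma):
    """Monte Carlo estimate of sum_i Cov(fit_i, y_i) / sigma^2 over replications (rows)."""
    centered = fit - fit.mean(axis=0)
    return float((centered * (y - mu)).mean(axis=0).sum() / sigma**2)


# Hypothetical simulation settings: 10 signal coordinates, 40 null coordinates.
rng = np.random.default_rng(0)
n, reps, sigma, lam = 50, 40000, 1.0, 1.0
mu = np.concatenate([np.full(10, 3.0), np.zeros(n - 10)])
y = mu + sigma * rng.standard_normal((reps, n))

# Lasso: covariance-based df versus average active set size.
lasso_fit = soft_threshold(y, lam)
lasso_df = covariance_df(lasso_fit, y, mu, sigma)
lasso_active = float((lasso_fit != 0).sum(axis=1).mean())

# Adaptive Lasso: the weights depend on y, so the derivative of each active
# coordinate is 1 + lam / y_i^2 rather than 1.
ada_fit = adaptive_threshold(y, lam)
ada_mask = ada_fit != 0
ada_df = covariance_df(ada_fit, y, mu, sigma)
ada_active = float(ada_mask.sum(axis=1).mean())
divergence = np.zeros_like(y)
divergence[ada_mask] = 1.0 + lam / y[ada_mask] ** 2
ada_divergence = float(divergence.sum(axis=1).mean())

print(f"Lasso:          df (covariance) = {lasso_df:.2f}, mean active set = {lasso_active:.2f}")
print(f"Adaptive Lasso: df (covariance) = {ada_df:.2f}, mean active set = {ada_active:.2f}, "
      f"mean divergence = {ada_divergence:.2f}")
```

In this toy setting the Lasso's covariance-based degrees of freedom should agree with the mean active set size up to Monte Carlo error, while the adaptive fit's degrees of freedom should exceed its mean active set size and instead track the mean divergence. The extra `lam / y_i^2` contribution is specific to this orthonormal, weight-exponent-one example and only hints at the kind of additional term that data-dependent weights can introduce.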