Augmenting a smooth cost function with an $\ell_1$ penalty allows analysts to conduct estimation and variable selection simultaneously in sophisticated models, and the resulting optimization problem can be solved efficiently using proximal gradient methods. However, one drawback of the $\ell_1$ penalty is bias: nonzero parameters are underestimated in magnitude, motivating techniques such as the Adaptive Lasso, which endow each parameter with its own penalty coefficient. Yet it is not clear how these parameter-specific penalties should be set in complex models. In this article, we study the approach of treating the penalty coefficients as additional decision variables to be learned in a \textit{Maximum a Posteriori} manner, developing a proximal gradient approach that jointly optimizes the penalty coefficients together with the parameters of any differentiable cost function. Beyond reducing bias in estimates, this procedure can also encourage arbitrary sparsity structure via a prior on the penalty coefficients. We compare our method to implementations of specific sparsity structures for non-Gaussian regression on synthetic and real datasets, finding our more general method to be competitive in both speed and accuracy. We then consider nonlinear models in two case studies, COVID-19 vaccination behavior and international refugee movement, highlighting the applicability of this approach to complex problems and intricate sparsity structures.
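As a rough sketch of the kind of joint objective this summary describes (the notation here is ours and may differ from the formal development in the paper body): writing $f(\beta)$ for the smooth cost, placing a Laplace prior with rate $\lambda_j \ge 0$ on each parameter $\beta_j$, and endowing the penalty coefficients with a hyperprior $p(\lambda)$ leads to a joint MAP problem of approximately the form
\[
\min_{\beta,\; \lambda \ge 0} \; f(\beta) + \sum_j \bigl(\lambda_j \lvert \beta_j \rvert - \log \lambda_j\bigr) - \log p(\lambda),
\]
where the $-\log \lambda_j$ terms come from the Laplace normalizing constant and the $-\log p(\lambda)$ term is what allows a prior on the penalty coefficients to encode sparsity structure; proximal gradient updates can then be applied jointly over $(\beta, \lambda)$.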