This paper introduces a smooth method for (structured) sparsity in $\ell_q$ and $\ell_{p,q}$ regularized optimization problems. Optimization of these non-smooth and possibly non-convex problems typically relies on specialized procedures. In contrast, our general framework is compatible with prevalent first-order optimization methods like Stochastic Gradient Descent and accelerated variants without any required modifications. This is accomplished through a smooth optimization transfer, comprising an overparametrization of selected model parameters using Hadamard products and a change of penalties. In the overparametrized problem, smooth and convex $\ell_2$ regularization of the surrogate parameters induces non-smooth and non-convex $\ell_q$ or $\ell_{p,q}$ regularization in the original parametrization. We show that our approach yields not only matching global minima but also equivalent local minima. This is particularly useful in non-convex sparse regularization, where finding global minima is NP-hard and local minima are known to generalize well. We provide a comprehensive overview consolidating various literature strands on sparsity-inducing parametrizations and propose meaningful extensions to existing approaches. The feasibility of our approach is evaluated through numerical experiments, which demonstrate that its performance is on par with or surpasses commonly used implementations of convex and non-convex regularization methods.
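To make the optimization transfer concrete, the simplest special case can be sketched numerically. Writing a coefficient as $\beta = u \odot v$ and penalizing $\tfrac{\lambda}{2}(\|u\|_2^2 + \|v\|_2^2)$ induces $\lambda\|\beta\|_1$ in the original parametrization, since $\tfrac{1}{2}(u_j^2 + v_j^2) \ge |u_j v_j|$ with equality when $|u_j| = |v_j|$. The example below is an illustrative sketch only, not the paper's implementation: it assumes an identity design matrix (so the $\ell_1$ solution is the closed-form soft-thresholding operator), plain full-batch gradient descent with a fixed step size, and an asymmetric initialization. The function names `fit_hadamard` and `soft_threshold` and all numerical settings are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(y, lam):
    """Closed-form minimizer of 0.5*(y - b)**2 + lam*|b|, elementwise."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)


def fit_hadamard(y, lam, lr=0.05, steps=5000):
    """Gradient descent on the smooth surrogate
        0.5*||y - u*v||^2 + 0.5*lam*(||u||^2 + ||v||^2),
    where u*v is the elementwise (Hadamard) product.
    Returns the reconstructed coefficients beta = u*v.
    """
    # Assumption: asymmetric start. With u == v the product u*v can never
    # become negative, so negative targets would stall at the saddle at 0.
    u = np.ones_like(y)
    v = 0.5 * np.ones_like(y)
    for _ in range(steps):
        r = u * v - y              # residual of the data-fit term
        grad_u = r * v + lam * u   # gradient with respect to u
        grad_v = r * u + lam * v   # gradient with respect to v
        u, v = u - lr * grad_u, v - lr * grad_v
    return u * v


if __name__ == "__main__":
    y = np.array([3.0, -3.0, 0.5, -0.2])
    beta = fit_hadamard(y, lam=1.0)
    print("Hadamard surrogate:", np.round(beta, 6))
    print("soft-thresholding :", soft_threshold(y, 1.0))
```

Under these assumptions the smooth surrogate recovers the soft-thresholded values, shrinking the two large entries by $\lambda$ and driving the two small entries to zero up to numerical tolerance, although the objective being optimized contains no non-smooth term. Deeper factorizations with more than two Hadamard factors, which correspond to the non-convex $\ell_q$ penalties with $q < 1$, are not covered by this sketch.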