$l^q$-regularization has proven to be an attractive technique in machine learning and statistical modeling. It aims to improve the generalization (prediction) capability of a machine (model) by appropriately shrinking its coefficients. The shape of an $l^q$ estimator varies with the choice of the regularization order $q$. In particular, $l^1$ leads to the LASSO estimate, while $l^{2}$ corresponds to the smooth ridge regression. This makes the order $q$ a potential tuning parameter in applications. To facilitate the use of $l^{q}$-regularization, we seek a modeling strategy in which an elaborate selection of $q$ can be avoided. In this spirit, we place our investigation within a general framework of $l^{q}$-regularized kernel learning under a sample-dependent hypothesis space (SDHS). For a designated class of kernel functions, we show that all $l^{q}$ estimators for $0< q < \infty$ attain similar generalization error bounds. These bounds are almost optimal in the sense that, up to a logarithmic factor, the upper and lower bounds are asymptotically identical. This finding tentatively reveals that, in some modeling contexts, the choice of $q$ might not have a strong impact on the generalization capability. From this perspective, $q$ can be specified arbitrarily, or chosen merely by non-generalization criteria such as smoothness, computational complexity, and sparsity.
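To make the setting concrete, the following is a minimal sketch of $l^q$ coefficient regularization over a sample-dependent hypothesis space. It assumes the commonly used form in which the estimator is $f(x)=\sum_{i=1}^{m} a_i K(x_i, x)$ and the coefficients minimize $\frac{1}{m}\sum_{i=1}^{m}(f(x_i)-y_i)^2 + \lambda\sum_{i=1}^{m}|a_i|^q$; the abstract does not spell out this objective, so it should be read as an assumption. The Gaussian kernel, the values of `sigma` and `lam`, the synthetic data, the solvers (a closed form for $q=2$, proximal gradient for $q=1$), and all function names are likewise illustrative choices rather than the authors' method.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    # Pairwise Gaussian kernel matrix (kernel choice and width are assumptions).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma**2)

def objective(K, y, a, lam, q):
    # Empirical squared loss plus the l^q penalty on the coefficients.
    return np.mean((K @ a - y) ** 2) + lam * np.sum(np.abs(a) ** q)

def fit_l2(K, y, lam):
    # q = 2: closed-form minimizer of (1/m)||Ka - y||^2 + lam * ||a||_2^2.
    m = len(y)
    return np.linalg.solve(K.T @ K + m * lam * np.eye(m), K.T @ y)

def fit_l1(K, y, lam, n_iter=2000):
    # q = 1: proximal gradient (ISTA) with soft-thresholding.
    m = len(y)
    # Step size 1/L, where L = (2/m) * ||K||_2^2 bounds the gradient's Lipschitz constant.
    step = m / (2.0 * np.linalg.norm(K, 2) ** 2)
    a = np.zeros(m)
    for _ in range(n_iter):
        grad = (2.0 / m) * K.T @ (K @ a - y)
        z = a - step * grad
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a

if __name__ == "__main__":
    # Synthetic regression data (hypothetical, for illustration only).
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(60, 1))
    y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(60)
    K = gaussian_kernel(X, X)

    X_test = np.linspace(-1, 1, 200)[:, None]
    K_test = gaussian_kernel(X_test, X)
    truth = np.sin(np.pi * X_test[:, 0])

    lam = 1e-3
    for q, a in ((1, fit_l1(K, y, lam)), (2, fit_l2(K, y, lam))):
        test_err = np.mean((K_test @ a - truth) ** 2)
        nonzero = int(np.sum(np.abs(a) > 1e-8))
        print(f"q={q}: test MSE={test_err:.4f}, nonzero coefficients={nonzero}")
```

Comparing the two fits on held-out points gives a hands-on view of the question the abstract raises. A single run on toy data is only an illustration and does not by itself support or contradict the stated bounds.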