Common regularization algorithms for linear regression, such as LASSO and Ridge regression, rely on a regularization hyperparameter that trades off the fitting error against the norm of the learned model coefficients. Because this hyperparameter is a scalar, it can easily be selected by random or grid search on a cross-validation criterion. However, a scalar hyperparameter limits the algorithm's flexibility and its potential for better generalization. In this paper, we address the problem of linear regression with l2-regularization in which a different regularization hyperparameter is associated with each input variable. We optimize these hyperparameters with a gradient-based approach, in which the gradient of a cross-validation criterion with respect to the regularization hyperparameters is computed analytically through matrix differential calculus. Additionally, we introduce two strategies tailored to sparse model learning problems that aim to reduce the risk of overfitting to the validation data. Numerical examples demonstrate that our multi-hyperparameter regularization approach outperforms LASSO, Ridge, and Elastic Net regression. Moreover, the analytical computation of the gradient requires less computational time than automatic differentiation, especially when the number of input variables is large. An application to the identification of over-parameterized Linear Parameter-Varying models is also presented.
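To make the idea of an analytically computed hyperparameter gradient concrete, the following is a minimal sketch, not the paper's implementation. It assumes a single hold-out validation split rather than a full cross-validation criterion, a squared-error validation loss, and the closed-form estimate theta(lam) = (X'X + diag(lam))^{-1} X'y with one weight lam_i per input variable. The function names `fit_weighted_ridge` and `val_loss_and_grad` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_weighted_ridge(X, y, lam):
    """Minimize 0.5*||y - X theta||^2 + 0.5*sum_i lam_i*theta_i^2 in closed form.

    Assumption: one regularization weight lam_i per column of X.
    Returns the estimate theta and the matrix A = X'X + diag(lam).
    """
    A = X.T @ X + np.diag(lam)
    theta = np.linalg.solve(A, X.T @ y)
    return theta, A


def val_loss_and_grad(X_tr, y_tr, X_val, y_val, lam):
    """Hold-out validation loss and its gradient with respect to lam.

    Since d(theta)/d(lam_i) = -A^{-1} e_i theta_i, the chain rule gives
    d(loss)/d(lam_i) = (A^{-1} X_val' r)_i * theta_i, with r the validation residual.
    """
    theta, A = fit_weighted_ridge(X_tr, y_tr, lam)
    r = y_val - X_val @ theta            # validation residual
    loss = 0.5 * (r @ r)
    grad = np.linalg.solve(A, X_val.T @ r) * theta
    return loss, grad


if __name__ == "__main__":
    # Synthetic data (illustrative only): only the first two inputs matter.
    rng = np.random.default_rng(0)
    theta_true = np.array([2.0, -1.0, 0.0, 0.0, 0.0])
    X_tr = rng.normal(size=(40, 5))
    y_tr = X_tr @ theta_true + 0.1 * rng.normal(size=40)
    X_val = rng.normal(size=(40, 5))
    y_val = X_val @ theta_true + 0.1 * rng.normal(size=40)

    lam = np.ones(5)
    loss, grad = val_loss_and_grad(X_tr, y_tr, X_val, y_val, lam)
    print("validation loss:", loss)
    print("gradient w.r.t. lam:", grad)
```

The gradient costs one additional linear solve with the same matrix `A`, so it can be passed to any gradient-based optimizer. In practice the weights must be kept nonnegative, for example by optimizing over log(lam); that choice is an assumption of this sketch and is not taken from the abstract.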