Given a high-dimensional covariate matrix and a response vector, ridge-regularized sparse linear regression selects a subset of features that explains the relationship between covariates and the response in an interpretable manner. To tune the sparsity and robustness of linear regressors, techniques like k-fold cross-validation are commonly used for hyperparameter selection. However, cross-validation substantially increases the computational cost of sparse regression, as it requires solving many mixed-integer optimization problems (MIOs). Additionally, validation metrics are often noisy estimators of test set error, with different hyperparameter combinations yielding models with different noise levels. Optimizing over these metrics is therefore vulnerable to out-of-sample disappointment, especially in underdetermined settings. To improve upon this state of affairs, we make two key contributions. First, motivated by the generalization theory literature, we propose selecting hyperparameters that minimize a weighted sum of a cross-validation metric and a model's output stability, thus reducing the risk of poor out-of-sample performance. Second, we leverage ideas from the mixed-integer optimization literature to obtain computationally tractable relaxations of k-fold cross-validation metrics and of regressor output stability, allowing hyperparameters to be selected after solving fewer MIOs. These relaxations yield an efficient cyclic coordinate descent scheme that achieves lower validation errors than traditional methods such as grid search. On synthetic datasets, our confidence adjustment procedure improves out-of-sample performance by 2%-5% compared to minimizing the k-fold error alone. On 13 real-world datasets, our confidence adjustment procedure reduces test set error by 2%, on average.
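The core selection criterion above, a weighted sum of a k-fold cross-validation error and an output-stability term, can be sketched in a few lines. The sketch below is illustrative only: it uses plain closed-form ridge regression rather than the paper's MIO-based sparse regression, and the stability proxy (dispersion of per-fold coefficient vectors around their mean) is a hypothetical stand-in for the stability measure developed in the paper. The function names `ridge_fit`, `cv_error_and_stability`, and `select_gamma` are assumptions, not part of the original work.

```python
import numpy as np

def ridge_fit(X, y, gamma):
    """Closed-form ridge solution: (X^T X + gamma I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + gamma * np.eye(d), X.T @ y)

def cv_error_and_stability(X, y, gamma, k=5, seed=0):
    """k-fold CV error plus a simple output-stability proxy:
    mean distance of per-fold coefficient vectors from their average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    errs, betas = [], []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        b = ridge_fit(X[train], y[train], gamma)
        errs.append(np.mean((X[val] @ b - y[val]) ** 2))
        betas.append(b)
    betas = np.array(betas)
    stability = np.mean(np.linalg.norm(betas - betas.mean(axis=0), axis=1))
    return np.mean(errs), stability

def select_gamma(X, y, gammas, weight=0.5):
    """Confidence-adjusted selection: minimize CV error + weight * instability,
    rather than the (noisier) CV error alone."""
    scores = []
    for g in gammas:
        err, instability = cv_error_and_stability(X, y, g)
        scores.append(err + weight * instability)
    return gammas[int(np.argmin(scores))]
```

With `weight=0` this reduces to ordinary CV-based grid search; a positive weight biases the choice toward hyperparameters whose fitted coefficients vary less across folds, which is the sense in which the procedure hedges against out-of-sample disappointment.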