In this paper, we show how mixed-integer conic optimization can be used to combine feature subset selection with holistic generalized linear models to fully automate the model selection process. Concretely, we directly optimize the Akaike and Bayesian information criteria (AIC and BIC) while imposing constraints that address multicollinearity among the candidate features. In particular, we propose a novel pairwise correlation constraint that combines the sign coherence constraint with ideas from classical statistical models such as ridge regression and the OSCAR model.
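To make the idea of optimizing an information criterion under a multicollinearity constraint concrete, the following is a minimal sketch and not the method of this paper. It assumes a Gaussian linear model, replaces the mixed-integer conic solver with exhaustive enumeration (feasible only for a handful of features), and uses a simple stand-in constraint that forbids selecting two features whose absolute pairwise correlation exceeds a threshold. The names `info_criterion`, `best_subset`, and `rho_max`, as well as the synthetic data, are hypothetical and chosen for illustration only.

```python
import itertools
import numpy as np


def info_criterion(X, y, subset, criterion="aic"):
    """AIC or BIC of an ordinary least-squares fit (with intercept) on `subset`."""
    n = len(y)
    A = np.column_stack([np.ones(n), X[:, list(subset)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    # Maximized Gaussian log-likelihood with variance estimate rss / n.
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    k = A.shape[1] + 1  # regression coefficients plus the noise variance
    penalty = 2 * k if criterion == "aic" else k * np.log(n)
    return penalty - 2 * log_lik


def best_subset(X, y, rho_max=0.8, criterion="aic"):
    """Enumerate all subsets and keep the one with the smallest criterion value.

    Assumption (stand-in constraint): a subset is infeasible if it contains
    two features with absolute pairwise correlation above `rho_max`.
    """
    p = X.shape[1]
    corr = np.corrcoef(X, rowvar=False)
    best, best_score = (), info_criterion(X, y, (), criterion)
    for size in range(1, p + 1):
        for subset in itertools.combinations(range(p), size):
            if any(abs(corr[i, j]) > rho_max
                   for i, j in itertools.combinations(subset, 2)):
                continue  # violates the pairwise correlation constraint
            score = info_criterion(X, y, subset, criterion)
            if score < best_score:
                best, best_score = subset, score
    return best, best_score


# Synthetic data: feature 1 is a near-copy of feature 0, feature 3 is noise.
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2, x3])
y = 3.0 * x0 + 2.0 * x2 + 0.5 * rng.normal(size=n)

subset, score = best_subset(X, y, rho_max=0.8, criterion="aic")
print("selected features:", subset, "criterion value:", round(score, 2))
```

Enumeration visits all 2^p subsets, which is why an exact approach at realistic scale needs an optimization formulation such as the mixed-integer conic one described above. Passing `criterion="bic"` swaps the penalty from 2k to k log n.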