Sparse-Input Neural Network using Group Concave Regularization

Simultaneous feature selection and non-linear function estimation is challenging, especially in high-dimensional settings where the number of variables exceeds the available sample size. In this article, we investigate the problem of feature selection in neural networks. Although the group LASSO has been used to select variables when learning with neural networks, it tends to admit unimportant variables into the model to compensate for its over-shrinkage. To overcome this limitation, we propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings. The main idea is to apply a proper concave penalty to the $\ell_2$ norm of the weights on all outgoing connections of each input node, and thus obtain a neural net that uses only a small subset of the original variables. In addition, to tackle the challenge of complex optimization landscapes, we develop an effective algorithm based on backward path-wise optimization that yields stable solution paths. Our extensive simulation studies and real data examples demonstrate satisfactory finite-sample performance of the proposed estimator in feature selection and prediction for continuous, binary, and time-to-event outcomes.
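As a rough illustration of the main idea, the sketch below computes a group concave penalty on the first-layer weights of a network. The abstract does not say which concave penalty is used, so the minimax concave penalty (MCP) here is an assumption chosen for illustration, and the names `mcp`, `group_penalty`, `selected_inputs`, `lam`, and `gamma` are hypothetical. The weight matrix `W` is assumed to have one row per input node, so the $\ell_2$ norm of a row is the norm of that input's outgoing weights.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Minimax concave penalty evaluated elementwise at t >= 0 (illustrative choice)."""
    t = np.asarray(t, dtype=float)
    rising = lam * t - t**2 / (2.0 * gamma)  # concave part, used while t <= gamma * lam
    flat = 0.5 * gamma * lam**2              # constant value once t exceeds gamma * lam
    return np.where(t <= gamma * lam, rising, flat)

def group_penalty(W, lam, gamma):
    """Sum the penalty over input nodes; W has shape (n_inputs, n_hidden)."""
    row_norms = np.linalg.norm(W, axis=1)    # l2 norm of each input's outgoing weights
    return float(mcp(row_norms, lam, gamma).sum())

def selected_inputs(W, tol=1e-8):
    """Indices of inputs whose outgoing weights are not all (numerically) zero."""
    return np.flatnonzero(np.linalg.norm(W, axis=1) > tol)

# Three inputs, two hidden units; the second input is switched off entirely.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.3, 0.4]])
penalty = group_penalty(W, lam=1.0, gamma=2.0)
kept = selected_inputs(W)
```

Because the penalty flattens out for large row norms, an input with large weights is not shrunk further, whereas a group LASSO penalty would keep growing linearly with the norm. In training, a term like `group_penalty` would be added to the loss, and an input counts as dropped once its whole row of `W` is zero.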


