Simultaneous feature selection and non-linear function estimation are challenging, especially in high-dimensional settings where the number of variables exceeds the available sample size. In this article, we investigate the problem of feature selection in neural networks. Although the group least absolute shrinkage and selection operator (LASSO) has been utilized to select variables for learning with neural networks, it tends to include unimportant variables in the model to compensate for its over-shrinkage. To overcome this limitation, we propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings. The main idea is to apply a proper concave penalty to the $l_2$ norm of the weights on all outgoing connections of each input node, and thus obtain a neural net that uses only a small subset of the original variables. In addition, to tackle the challenge of complex optimization landscapes, we develop an effective algorithm based on backward path-wise optimization that yields stable solution paths. We provide a rigorous theoretical analysis of the proposed framework, establishing finite-sample guarantees for both variable selection consistency and prediction accuracy. These results are supported by extensive simulation studies and real data applications, which demonstrate the finite-sample performance of the estimator in feature selection and prediction across continuous, binary, and time-to-event outcomes.
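To make the penalty structure concrete, the penalized objective implied by the abstract can be sketched as follows; the notation here is assumed for illustration, with $f_{\theta}$ the network, $W_j$ the vector of first-layer weights on the outgoing connections of input $j$, $\ell$ the loss, and $\rho_{\lambda}$ a concave penalty (e.g., MCP or SCAD) with tuning parameter $\lambda$:

$$
\hat{\theta} \in \arg\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f_{\theta}(x_i), y_i\big) \;+\; \sum_{j=1}^{p} \rho_{\lambda}\big(\|W_j\|_2\big).
$$

Because the penalty acts on the group norm $\|W_j\|_2$, an input $j$ whose entire outgoing weight vector is shrunk to zero is dropped from the fitted network, which is how the estimator performs variable selection.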
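The sketch below illustrates one plausible reading of the backward path-wise strategy, assuming PyTorch: begin from a dense, lightly penalized model and warm-start each fit as $\lambda$ grows, recording the active set along the path. All names here (`mcp`, `group_penalty`, `fit_path`) are hypothetical, and plain gradient descent does not produce exact zeros, so a small threshold is used to read off the active inputs; this is not the authors' algorithm, only a minimal illustration of warm-started path-wise training with a group MCP penalty on the input layer.

```python
# Minimal sketch (assumption, not the paper's implementation): backward
# path-wise training of a sparse-input net with a group MCP penalty on
# the l2 norm of each input node's outgoing weight vector.
import torch
import torch.nn as nn

def mcp(z, lam, gamma=3.0):
    """MCP penalty rho_lambda(z), evaluated elementwise for z >= 0."""
    return torch.where(
        z <= gamma * lam,
        lam * z - z**2 / (2 * gamma),
        torch.full_like(z, 0.5 * gamma * lam**2),
    )

def group_penalty(first_layer, lam):
    # Columns of the first-layer weight matrix are the outgoing
    # connections of each input node; penalize their l2 norms.
    group_norms = first_layer.weight.norm(dim=0)  # shape: (p,)
    return mcp(group_norms, lam).sum()

def fit_path(X, y, lambdas, hidden=16, epochs=200, lr=1e-2):
    """Backward path: start from the dense (smallest-lambda) fit and
    warm-start each subsequent fit as lambda increases."""
    p = X.shape[1]
    net = nn.Sequential(nn.Linear(p, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    path = []
    for lam in sorted(lambdas):  # small -> large lambda
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(net(X).squeeze(-1), y)
            loss = loss + group_penalty(net[0], lam)
            loss.backward()
            opt.step()
        # Inputs whose group norms stay above a small threshold are
        # treated as selected at this lambda (illustrative cutoff).
        active = (net[0].weight.norm(dim=0) > 1e-3).nonzero().flatten()
        path.append((lam, active.tolist()))
    return path
```

Warm-starting along the $\lambda$ grid is what makes the solution paths stable: each fit begins near the previous solution rather than from a fresh random initialization, which matters given the non-convex optimization landscape the abstract highlights.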