SLM: End-to-end Feature Selection via Sparse Learnable Masks

Feature selection is widely used to reduce compute requirements during training, improve model interpretability, and improve model generalization. We propose SLM (Sparse Learnable Masks), a canonical approach for end-to-end feature selection that scales well with respect to both the feature dimension and the number of samples. At the heart of SLM lies a simple but effective learnable sparse mask that learns which features to select. The mask gives rise to a novel objective that provably maximizes the mutual information (MI) between the selected features and the labels; the objective is derived from first principles as a quadratic relaxation of mutual information. In addition, we derive a scaling mechanism that allows SLM to precisely control the number of features selected, through a novel use of sparsemax. Ablation studies show that this mechanism allows for more effective learning. Empirically, SLM achieves state-of-the-art results against a variety of competitive baselines on eight benchmark datasets, often by a significant margin, especially on datasets with real-world challenges such as class imbalance.
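To make the idea of a sparse mask concrete, the following is a minimal NumPy sketch of the standard sparsemax projection applied to a vector of per-feature logits. It is an illustration only, not the SLM implementation: the abstract does not specify the scaling mechanism or the training objective, so the `scale` parameter and the `apply_mask` helper are hypothetical names introduced here. The sketch shows only that sparsemax yields exact zeros and that scaling the logits changes how many features stay active.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of z onto the probability simplex (standard sparsemax)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # logits in descending order
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, z.size + 1)
    support = 1.0 + ks * z_sorted > cumsum   # True for the entries that stay nonzero
    k = ks[support][-1]                      # size of the support
    tau = (cumsum[k - 1] - 1.0) / k          # threshold subtracted from every logit
    return np.maximum(z - tau, 0.0)

def apply_mask(X, logits, scale=1.0):
    """Hypothetical helper: weight the columns of X by a sparsemax mask.

    `scale` is an assumption for illustration; larger values sharpen the mask
    and never increase the number of nonzero entries.
    """
    mask = sparsemax(scale * np.asarray(logits, dtype=float))
    return X * mask, mask

logits = np.array([0.5, 0.3, 0.1, -0.2])     # one logit per feature (learned in practice)
X = np.ones((2, 4))                          # toy batch: 2 samples, 4 features
_, soft = apply_mask(X, logits, scale=1.0)
_, sharp = apply_mask(X, logits, scale=10.0)
print(np.count_nonzero(soft), np.count_nonzero(sharp))
```

In an end-to-end setting the logits would be trained jointly with the downstream model; here they are fixed so that the effect of scaling on the number of active features is easy to see.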


