Feature selection is a crucial tool in machine learning and is widely applied across various scientific disciplines. Traditional supervised methods generally identify a universal set of informative features for the entire population. However, feature relevance often varies with context, while the context itself may not directly affect the outcome variable. Here, we propose a novel architecture for contextual feature selection where the subset of selected features is conditioned on the value of context variables. Our new approach, Conditional Stochastic Gates (c-STG), models the importance of features using conditional Bernoulli variables whose parameters are predicted based on contextual variables. We introduce a hypernetwork that maps context variables to feature selection parameters to learn the context-dependent gates along with a prediction model. We further present a theoretical analysis of our model, indicating that it can improve performance and flexibility over population-level methods in complex feature selection settings. Finally, we conduct an extensive benchmark using simulated and real-world datasets across multiple domains demonstrating that c-STG can lead to improved feature selection capabilities while enhancing prediction accuracy and interpretability.
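The gating mechanism described above can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions: it uses a single linear layer as the hypernetwork, a hard-sigmoid clipping of a Gaussian perturbation as the continuous relaxation of the conditional Bernoulli gates, and an arbitrary noise scale `sigma`; the actual c-STG architecture, parameterization, and training objective may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def hard_sigmoid(x):
    # Clip to [0, 1]: a common continuous relaxation of a Bernoulli gate.
    return np.clip(x + 0.5, 0.0, 1.0)

def hypernetwork(context, W, b):
    # Maps context variables to per-feature gate parameters mu(c).
    # A single linear layer here for illustration; the paper's
    # hypernetwork may be deeper.
    return context @ W + b

def conditional_gates(mu, sigma=0.5, train=True):
    # During training, inject Gaussian noise (reparameterization trick);
    # at test time, use the deterministic mean gate.
    noise = rng.normal(0.0, sigma, size=mu.shape) if train else 0.0
    return hard_sigmoid(mu + noise)

# Toy example: 4 features, 2-dimensional context, batch of 3 samples.
d_features, d_context = 4, 2
W = rng.normal(size=(d_context, d_features))
b = np.zeros(d_features)

x = rng.normal(size=(3, d_features))   # feature vectors
c = rng.normal(size=(3, d_context))    # matching context vectors

mu = hypernetwork(c, W, b)
z = conditional_gates(mu, train=False)  # one gate in [0, 1] per feature
x_gated = x * z                         # context-dependent feature selection
```

Because the gates `z` depend on the context `c`, two samples with identical features but different contexts can have different feature subsets passed to the downstream prediction model, which is the core distinction from population-level stochastic gates.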