Contextual Feature Selection with Conditional Stochastic Gates

We study the problem of contextual feature selection, where the goal is to learn a predictive function while identifying subsets of informative features conditioned on specific contexts. To this end, we generalize the recently proposed stochastic gates (STG) [Yamada et al., 2020] by modeling the probabilistic gates as conditional Bernoulli variables whose parameters are predicted from the contextual variables. Our new scheme, termed conditional-STG (c-STG), comprises two networks: a hypernetwork that maps the contextual variables to the parameters of the probabilistic feature selection, and a prediction network that maps the selected features to the response variable. Training the two networks simultaneously incorporates context and feature selection within a single unified model. We provide a theoretical analysis of several properties of the proposed framework. Importantly, because the selected features can change with the context, our model makes feature selection more flexible and adaptable and can therefore better capture variations in the data. We apply c-STG to simulated and real-world datasets from healthcare, housing, and neuroscience, and demonstrate that it effectively selects contextually meaningful features, thereby enhancing predictive performance and interpretability.
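The two-network structure described above can be illustrated with a minimal forward-pass sketch in NumPy. It is not the authors' implementation: the layer sizes, the sigmoid on the hypernetwork output, the clipped-Gaussian relaxation of the gates (borrowed from the STG formulation), the linear prediction network, and all function names are assumptions made for illustration. Only the forward pass is shown; no training loop or gradients are included.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
N_CONTEXT, N_FEATURES, N_HIDDEN = 2, 5, 8  # hypothetical sizes


def init_linear(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


# Hypernetwork: context -> per-feature gate parameters (assumed two layers).
hyper_params = [init_linear(N_CONTEXT, N_HIDDEN), init_linear(N_HIDDEN, N_FEATURES)]
# Prediction network: gated features -> response (assumed linear for brevity).
pred_params = init_linear(N_FEATURES, 1)


def gate_means(context, params):
    """Map each context vector to one gate mean per feature, in (0, 1)."""
    (w1, b1), (w2, b2) = params
    hidden = np.tanh(context @ w1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))


def sample_gates(mu, sigma=0.5, training=True):
    """Relaxed gates: Gaussian noise around mu, clipped to [0, 1] (assumption)."""
    noise = rng.normal(0.0, sigma, size=mu.shape) if training else 0.0
    return np.clip(mu + noise, 0.0, 1.0)


def expected_open_gates(mu, sigma=0.5):
    """Sparsity penalty: mean over samples of sum_d Phi(mu_d / sigma)."""
    phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
    return phi(mu / sigma).sum(axis=1).mean()


def predict(features, gates, params):
    """Apply the gates elementwise, then the prediction network."""
    w, b = params
    return (features * gates) @ w + b


context = rng.normal(size=(4, N_CONTEXT))    # one context vector per sample
features = rng.normal(size=(4, N_FEATURES))  # explanatory features
mu = gate_means(context, hyper_params)
gates = sample_gates(mu, training=True)
y_hat = predict(features, gates, pred_params)
penalty = expected_open_gates(mu)
```

Under these assumptions, a joint training objective would add a weighted `penalty` term to a prediction loss on `y_hat` and update both sets of parameters together, which is one way to read the simultaneous training of the two networks. At inference, `sample_gates(mu, training=False)` returns the gate means without noise, so each context yields its own deterministic feature weighting.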
