Dynamic feature selection, where we sequentially query features to make accurate predictions with a minimal budget, is a promising paradigm to reduce feature acquisition costs and provide transparency into a model's predictions. The problem is challenging, however, as it requires both predicting with arbitrary feature sets and learning a policy to identify valuable selections. Here, we take an information-theoretic perspective and prioritize features based on their mutual information with the response variable. The main challenge is implementing this policy, and we design a new approach that estimates the mutual information in a discriminative rather than generative fashion. Building on our approach, we then introduce several further improvements: allowing variable feature budgets across samples, enabling non-uniform feature costs, incorporating prior information, and exploring modern architectures to handle partial inputs. Our experiments show that our method provides consistent gains over recent methods across a variety of datasets.
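The greedy policy described above can be illustrated on a toy problem. The sketch below is not the discriminative estimator the abstract refers to: it assumes the joint distribution is known and computes the conditional mutual information exactly from a small table, purely to show the selection loop. The function names, the cost-normalized score, and the `min_gain` stopping threshold are all hypothetical choices made for illustration.

```python
import itertools
import math


def toy_joint():
    """Toy distribution (an assumption): y = x0, x1 is a noisy copy of y, x2 is noise."""
    joint = {}
    for x0, flip, x2 in itertools.product((0, 1), repeat=3):
        x1 = x0 ^ flip  # x1 agrees with x0 with probability 0.9
        joint[((x0, x1, x2), x0)] = 0.5 * (0.1 if flip else 0.9) * 0.5
    return joint


def conditional_mutual_information(joint, candidate, observed):
    """Exact I(y; x_candidate | observed values) in bits, from a joint table.

    joint maps (x_tuple, y) -> probability; observed maps feature index -> value.
    """
    # Keep only outcomes consistent with the features observed so far.
    rows = {k: p for k, p in joint.items()
            if p > 0 and all(k[0][i] == v for i, v in observed.items())}
    total = sum(rows.values())
    p_xy, p_x, p_y = {}, {}, {}
    for (x, y), p in rows.items():
        p /= total  # renormalize to the conditional distribution
        xi = x[candidate]
        p_xy[(xi, y)] = p_xy.get((xi, y), 0.0) + p
        p_x[xi] = p_x.get(xi, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(p * math.log2(p / (p_x[xi] * p_y[y]))
               for (xi, y), p in p_xy.items())


def greedy_select(joint, sample, costs, budget, min_gain=1e-9):
    """Query features one at a time, picking the best information-per-cost ratio.

    Stops when the budget is exhausted or no affordable feature is informative,
    so different samples can end up using different numbers of features.
    """
    observed, spent = {}, 0.0
    while True:
        best, best_score = None, min_gain
        for i in range(len(sample)):
            if i in observed or spent + costs[i] > budget:
                continue
            score = conditional_mutual_information(joint, i, observed) / costs[i]
            if score > best_score:
                best, best_score = i, score
        if best is None:
            return observed
        observed[best] = sample[best]  # "acquire" the feature value
        spent += costs[best]


if __name__ == "__main__":
    joint = toy_joint()
    sample = (1, 1, 0)
    print(greedy_select(joint, sample, costs=[1, 1, 1], budget=3))
    print(greedy_select(joint, sample, costs=[10, 1, 1], budget=5))
```

With uniform costs the loop queries only `x0` and then stops, since `y` is fully determined afterwards. When `x0` is made expensive and the budget is tight, it falls back to the cheaper noisy feature `x1`. In a realistic setting the joint distribution is unknown, which is why the exact computation here would have to be replaced by a learned estimate.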