Multi-objective Binary Coordinate Search for Feature Selection

A supervised feature selection method selects a suitable yet concise set of features for differentiating classes, a search that is computationally expensive on large-scale datasets. Feature selection should therefore aim both to minimize the number of selected features and to maximize accuracy on classification or whatever other task is targeted. On many real-world datasets this task is computationally demanding and calls for an efficient algorithm that reaches a set of optimal features within a limited number of fitness evaluations. To this end, we propose the binary multi-objective coordinate search (MOCS) algorithm for large-scale feature selection problems. To the best of our knowledge, it is the first multi-objective coordinate search algorithm. The method generates new individuals by flipping one variable of a candidate solution on the Pareto front, which lets us assess the contribution of each feature within the corresponding subset. In effect, this strategy plays the role of both crossover and mutation operators in generating distinct feature subsets. The reported results indicate that our method is significantly superior to NSGA-II on five real-world large-scale datasets, particularly when the computing budget is limited. Moreover, this simple, hyper-parameter-free algorithm solves feature selection much faster and more efficiently than NSGA-II.
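The bit-flipping idea in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the archive handling, the visiting order, the stopping rule, and every name below (`coordinate_search`, `update_front`, `toy_objectives`) are assumptions made for illustration. The toy objective is a hypothetical stand-in for a real classifier's error, so that the example runs without any dataset.

```python
import random


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_front(front, cand):
    """Try to add (mask, objectives) to the non-dominated archive in place.

    Returns True if the candidate was kept.
    """
    _, objs = cand
    if any(dominates(o, objs) or o == objs for _, o in front):
        return False
    # Drop archive members that the newcomer dominates.
    front[:] = [(m, o) for m, o in front if not dominates(objs, o)]
    front.append(cand)
    return True


def toy_objectives(mask, relevant):
    """Hypothetical stand-in for (classification error, number of features)."""
    hits = sum(1 for i in relevant if mask[i])
    return ((len(relevant) - hits) / len(relevant), sum(mask))


def coordinate_search(n_features, evaluate, budget, seed=0):
    """Flip one bit at a time in each archive member (assumed search loop)."""
    rng = random.Random(seed)
    start = tuple(rng.randint(0, 1) for _ in range(n_features))
    front = [(start, evaluate(start))]
    seen = {start}          # avoid spending budget on repeated subsets
    evals = 1
    improved = True
    while improved and evals < budget:
        improved = False
        for mask, _ in list(front):      # snapshot of the current front
            for i in range(n_features):
                if evals >= budget:
                    break
                child = mask[:i] + (1 - mask[i],) + mask[i + 1:]
                if child in seen:
                    continue
                seen.add(child)
                evals += 1
                if update_front(front, (child, evaluate(child))):
                    improved = True
    # Sort by number of selected features for readability.
    return sorted(front, key=lambda s: s[1][1])


if __name__ == "__main__":
    relevant = (0, 3, 7)  # hypothetical "useful" features
    result = coordinate_search(10, lambda m: toy_objectives(m, relevant), budget=2000)
    for mask, (error, count) in result:
        print(count, round(error, 3), mask)
```

In a real setting, `evaluate` would train and score a classifier on the features selected by `mask`, which is why the number of calls to it is the budget that matters.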


