Feature Selection: A perspective on inter-attribute cooperation

From arXiv. This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in International Journal of Data Science and Analytics, and is available online at https://doi.org/10.1007/s41060-023-00439-z

High-dimensional datasets pose a challenge for learning tasks in data mining and machine learning. Feature selection is an effective technique for dimensionality reduction, and it is often an essential data preprocessing step before a learning algorithm is applied. Over the decades, filter feature selection methods have evolved from simple univariate relevance-ranking algorithms, to more sophisticated relevance-redundancy trade-offs, and in recent years to approaches based on multivariate dependencies. This move toward capturing multivariate dependence aims to obtain unique information about the class from the intercooperation among features. This paper presents a comprehensive survey of state-of-the-art filter feature selection methods assisted by feature intercooperation and summarizes the contributions of the different approaches found in the literature. Furthermore, current issues and challenges are introduced to identify promising directions for future research and development.
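To make the notion of feature intercooperation concrete, the following is a minimal sketch that is not taken from the paper. It assumes a toy dataset in which the class is the XOR of two binary features, and it uses a hypothetical helper, `mutual_information`, that estimates mutual information from empirical counts. A univariate relevance ranking scores each feature on its own, while the joint score treats the pair as a single variable.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two equal-length sequences.

    Hypothetical helper for illustration; probabilities are plain relative frequencies.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Assumed toy data: the class label is the XOR of two binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

# Univariate relevance: each feature is scored against the class on its own.
mi_f1 = mutual_information(f1, label)
mi_f2 = mutual_information(f2, label)

# Joint relevance: the pair (f1, f2) is scored as one combined variable.
mi_joint = mutual_information(list(zip(f1, f2)), label)

print(f"I(f1; class)     = {mi_f1:.3f} bits")
print(f"I(f2; class)     = {mi_f2:.3f} bits")
print(f"I(f1, f2; class) = {mi_joint:.3f} bits")
```

On this toy data each feature alone carries 0 bits about the class, while the pair carries 1 bit, so a purely univariate ranking would discard both features even though together they determine the class exactly.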



Feature selection


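Feature selection, also called feature subset selection (FSS) or attribute selection, is the process of choosing N features from an existing set of M features so that a specific criterion of the system is optimized. It picks the most effective of the original features in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are key to training a model.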
