Multi-label learning has emerged as a crucial paradigm in data analysis, addressing scenarios where instances are associated with multiple class labels simultaneously. With the growing prevalence of multi-label data across diverse applications, such as text and image classification, the significance of multi-label feature selection has become increasingly evident. This paper presents a novel information-theoretical filter-based multi-label feature selection method, called ATR, with a new heuristic function. Combining algorithm adaptation and problem transformation approaches, ATR ranks features by their discriminative power with respect to individual labels as well as the abstract label space. Our experimental studies encompass twelve benchmark datasets spanning various domains and demonstrate the superiority of our approach over ten state-of-the-art information-theoretical filter-based multi-label feature selection methods across six evaluation metrics. Furthermore, our experiments affirm the scalability of ATR on benchmarks characterized by extensive feature and label spaces. The code is available at https://github.com/Sadegh28/ATR