Feature selection is a crucial step in data mining that improves model performance by reducing data dimensionality. However, the growing dimensionality of collected data exacerbates the "curse of dimensionality", in which computational cost grows exponentially with the number of dimensions. To tackle this issue, evolutionary computation (EC) approaches have gained popularity due to their simplicity and applicability. Unfortunately, the diverse designs of EC methods give them varying abilities to handle different data, and they often underutilize information and fail to share it effectively. In this paper, we propose a novel approach called PSO-based Multi-task Evolutionary Learning (MEL), which leverages multi-task learning to address these challenges. By sharing information between different feature selection tasks, MEL achieves enhanced learning ability and efficiency. We evaluate the effectiveness of MEL through extensive experiments on 22 high-dimensional datasets. Compared against 24 EC approaches, our method exhibits strong competitiveness. Our code is open-sourced on GitHub at https://github.com/wangxb96/MEL.