Feature selection is critical in machine learning for reducing dimensionality and improving model accuracy and efficiency. The exponential growth in feature space dimensionality of modern datasets leads directly to ambiguous samples and redundant features, which can severely degrade classification accuracy. Quantum machine learning offers potential advantages for addressing this challenge. In this paper, we propose a novel method, quantum support vector machine feature selection (QSVMF), which integrates quantum support vector machines with a multi-objective genetic algorithm. QSVMF optimizes several objectives simultaneously: maximizing classification accuracy, minimizing the number of selected features and the quantum circuit cost, and reducing feature covariance. We apply QSVMF to feature selection on a breast cancer dataset and compare its performance, using the selected features, against classical approaches. Experimental results show that QSVMF outperforms these classical approaches. Furthermore, the Pareto front solutions of QSVMF enable analysis of the trade-off between accuracy and feature set size, identifying extremely sparse yet accurate feature subsets. We contextualize the biological relevance of the selected features in terms of known breast cancer biomarkers. This work highlights the potential of quantum-based feature selection to enhance machine learning efficiency and performance on complex real-world data.
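To make the multi-objective formulation concrete, the following is a minimal sketch of how candidate feature subsets could be compared by Pareto dominance over the four objectives named above. It is not the authors' implementation: the function names (`to_minimization`, `dominates`, `pareto_front`) are hypothetical, the numbers are invented purely for illustration and are not results from the paper, and the genetic algorithm and the quantum SVM that would produce these scores are omitted.

```python
from typing import List, Sequence, Tuple

# One objective vector per candidate feature subset, all to be minimized.
Objectives = Tuple[float, float, float, float]


def to_minimization(accuracy: float, n_features: int,
                    circuit_cost: float, covariance: float) -> Objectives:
    """Turn the four objectives into a pure minimization problem.

    Accuracy is maximized, so it is converted to an error rate; the other
    three objectives are already minimized.
    """
    return (1.0 - accuracy, float(n_features), float(circuit_cost), float(covariance))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Objectives]) -> List[int]:
    """Return the indices of the candidates that no other candidate dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Illustrative values only (assumed, not taken from the paper):
# (accuracy, number of features, circuit cost, feature covariance).
candidates = [
    to_minimization(0.95, 5, 12, 0.30),
    to_minimization(0.93, 2, 6, 0.10),
    to_minimization(0.90, 5, 14, 0.35),  # worse than the first candidate on every axis but one, equal on that one
    to_minimization(0.97, 9, 20, 0.40),
]

front = pareto_front(candidates)
print(front)
```

In this toy example the third candidate is dominated and drops out, while the remaining three represent different accuracy versus sparsity trade-offs, which is the kind of analysis the Pareto front is meant to support.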