As malware attacks grow in number and sophistication, malware detection systems based on machine learning (ML) grow in importance. However, many popular ML models used for malware classification are supervised, and supervised classifiers often generalize poorly to novel malware. They must therefore be retrained frequently to detect new malware specimens, which can be time-consuming. Our work addresses this problem with a hybrid framework that combines theoretical quantum ML with feature selection strategies to reduce the data size and the training time of the malware classifier. Preliminary results show that a variational quantum classifier (VQC) trained on XGBoost-selected features reaches 78.91% test accuracy on the simulator. On IBM 5-qubit machines, the model trained on XGBoost-selected features achieved an average accuracy of 74% (±11.35%).
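The feature selection step can be illustrated with a minimal sketch. It assumes per-feature importance scores are already available (for example from the `feature_importances_` attribute of a trained XGBoost model) and keeps the `k` highest-ranked features. The function names, the scores, and the choice `k = 5` (one feature per qubit on a 5-qubit device) are illustrative assumptions, not details taken from the abstract.

```python
from typing import List, Sequence


def top_k_feature_indices(importances: Sequence[float], k: int) -> List[int]:
    """Return indices of the k largest importance scores, most important first."""
    if not 0 < k <= len(importances):
        raise ValueError("k must be between 1 and the number of features")
    # sorted() is stable, so tied scores keep their original column order.
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return ranked[:k]


def project(rows: Sequence[Sequence[float]], keep: Sequence[int]) -> List[List[float]]:
    """Keep only the selected feature columns of every sample."""
    return [[row[i] for i in keep] for row in rows]


# Hypothetical importance scores for 8 features (illustrative values only).
scores = [0.02, 0.30, 0.05, 0.18, 0.01, 0.22, 0.12, 0.10]

# Assumption: one feature per qubit, so a 5-qubit device gets 5 features.
keep = top_k_feature_indices(scores, k=5)

# Two toy samples with 8 features each, reduced to the selected columns.
samples = [[float(j) for j in range(8)], [float(j * 10) for j in range(8)]]
reduced = project(samples, keep)
```

The reduced samples would then be passed to the quantum classifier; how features are encoded into qubits is not specified in the abstract, so that step is left out here.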