Malware detection demands solutions that are both robust and adaptable to an evolving threat landscape. This paper introduces Meta Learning Malware Detection (MeLeMaD), a framework that leverages the adaptability and generalization capabilities of Model-Agnostic Meta-Learning (MAML) for malware detection. MeLeMaD incorporates a novel feature selection technique, Chunk-wise Feature Selection based on Gradient Boosting (CFSGB), tailored to large-scale, high-dimensional malware datasets, which significantly improves detection efficiency. MeLeMaD is validated on two benchmark malware datasets (CIC-AndMal2020 and BODMAS) and a custom dataset (EMBOD), and is evaluated in terms of accuracy, precision, recall, F1-score, Matthews correlation coefficient (MCC), and area under the curve (AUC). With accuracies of 98.04\% on CIC-AndMal2020 and 99.97\% on BODMAS, MeLeMaD outperforms state-of-the-art approaches. On the custom dataset EMBOD, it achieves an accuracy of 97.85\%. These results underscore MeLeMaD's potential to address the challenges of robustness, adaptability, and large-scale, high-dimensional data in malware detection, paving the way for more effective and efficient cybersecurity solutions.
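For readers less familiar with the evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, F1-score, and MCC are conventionally computed from confusion-matrix counts. It assumes a binary (malware vs. benign) setting; the function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper's experiments. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positive/negative counts. Any metric whose
    denominator is zero is reported as 0.0 (a common convention, assumed here).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient: uses all four cells of the matrix,
    # so it stays informative when the classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting MCC alongside accuracy is a sensible choice for malware data, where class imbalance can make accuracy alone look better than the classifier really is.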