Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection

This study conducts a thorough examination of malware detection using machine learning techniques, focusing on the evaluation of various classification models using the Mal-API-2019 dataset. The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively. Both ensemble and non-ensemble machine learning methods, such as Random Forest, XGBoost, K Nearest Neighbor (KNN), and Neural Networks, are explored. Special emphasis is placed on the importance of data pre-processing techniques, particularly TF-IDF representation and Principal Component Analysis, in improving model performance. Results indicate that ensemble methods, particularly Random Forest and XGBoost, exhibit superior accuracy, precision, and recall compared to others, highlighting their effectiveness in malware detection. The paper also discusses limitations and potential future directions, emphasizing the need for continuous adaptation to address the evolving nature of malware. This research contributes to ongoing discussions in cybersecurity and provides practical insights for developing more robust malware detection systems in the digital era.

翻译：本研究利用机器学习技术对恶意软件检测进行了全面考察，重点基于Mal-API-2019数据集评估了多种分类模型。其目标是通过更有效地识别和缓解威胁来提升网络安全能力。研究探索了集成与非集成两类机器学习方法，包括随机森林、XGBoost、K近邻和神经网络。特别强调了数据预处理技术——尤其是TF-IDF表示和主成分分析——在提升模型性能方面的重要性。结果表明，集成方法（特别是随机森林和XGBoost）在准确性、精确率和召回率方面均优于其他方法，突显了其在恶意软件检测中的有效性。本文还讨论了现有局限性与潜在未来方向，强调需持续适应以应对不断演变的恶意软件。本研究为网络安全领域的持续讨论做出了贡献，并为在数字时代开发更稳健的恶意软件检测系统提供了实践洞察。

相关内容

Machine Learning

关注 2251

机器学习（Machine Learning）是一个研究计算学习方法的国际论坛。该杂志发表文章，报告广泛的学习方法应用于各种学习问题的实质性结果。该杂志的特色论文描述研究的问题和方法，应用研究和研究方法的问题。有关学习问题或方法的论文通过实证研究、理论分析或与心理现象的比较提供了坚实的支持。应用论文展示了如何应用学习方法来解决重要的应用问题。研究方法论文改进了机器学习的研究方法。所有的论文都以其他研究人员可以验证或复制的方式描述了支持证据。论文还详细说明了学习的组成部分，并讨论了关于知识表示和性能任务的假设。官网地址：http://dblp.uni-trier.de/db/journals/ml/

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

专知会员服务

66+阅读 · 2023年6月10日

《用于无线通信和传感的智能反射面 (IRS)》（ICC 2022）新加坡国立大学2022最新53页slides

专知会员服务

26+阅读 · 2022年11月16日

《多范式建模与仿真：系统工程视角》CMU 2022最新24页slides

专知会员服务

59+阅读 · 2022年11月4日

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日