The proliferation of malware, and in particular the widespread use of packing, poses a significant challenge to static analysis and signature-based malware detection. Applying packing to the original executable code makes it difficult to extract meaningful features and signatures. To cope with the growing volume of malware in the wild, researchers and anti-malware companies have turned to machine learning, with very promising results. However, little is known about the effects of packing on static machine learning-based malware detection and classification systems. This work addresses that gap by investigating the impact of packing on the performance of static machine learning-based models for malware detection and classification, with a particular focus on models that use visualisation techniques. To this end, we present a comprehensive analysis of various packing techniques and their effects on the performance of machine learning-based detectors and classifiers. Our findings highlight the limitations of current static detection and classification systems and underscore the need for a proactive approach to counter the evolving tactics of malware authors.