Query-Free Evasion Attacks Against Machine Learning-Based Malware Detectors with Generative Adversarial Networks

Malware detectors based on machine learning (ML) have been shown to be susceptible to adversarial malware examples. However, current methods for generating such examples have limitations. They rely either on detailed model information (gradient-based attacks) or on detailed model outputs such as class probabilities (score-based attacks), neither of which is available in real-world scenarios. Alternatively, adversarial examples can be crafted using only the label assigned by the detector (label-based attacks) to train a substitute network or a reinforcement learning agent. Nonetheless, label-based attacks may require anywhere from a few to thousands of queries to the black-box system, depending on the approach, which might not be feasible against malware detectors. This work presents a novel query-free approach to craft adversarial malware examples that evade ML-based malware detectors. To this end, we have devised a GAN-based framework to generate adversarial malware examples that look similar to benign executables in the feature space. To demonstrate the suitability of our approach, we have applied the GAN-based attack to three types of features commonly employed by static ML-based malware detectors: (1) byte histogram features, (2) API-based features, and (3) string-based features. Results show that our model-agnostic approach performs on par with MalGAN, while generating more realistic adversarial malware examples without requiring any queries to the malware detectors. Furthermore, we have tested the generated adversarial examples against state-of-the-art multimodal and deep learning malware detectors, showing a decrease in detection performance, as well as a decrease in the average number of detections by the anti-malware engines in VirusTotal.
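To make the first feature type concrete, the following is a minimal sketch of a byte histogram feature, together with a simple distance that illustrates what "similar in the feature space" could mean. It is not the paper's implementation: the function names `byte_histogram` and `l1_distance`, the normalization by file length, and the choice of L1 distance are all assumptions made for illustration.

```python
from collections import Counter


def byte_histogram(data: bytes, normalize: bool = True) -> list[float]:
    """Return a 256-bin histogram of the byte values in `data`.

    Hypothetical helper: normalizing by file length is an assumption,
    not something stated in the abstract.
    """
    counts = Counter(data)  # iterating over bytes yields ints in 0..255
    hist = [float(counts.get(value, 0)) for value in range(256)]
    total = len(data)
    if normalize and total > 0:
        hist = [count / total for count in hist]
    return hist


def l1_distance(p: list[float], q: list[float]) -> float:
    """Sum of absolute differences between two histograms (illustrative metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    # Toy byte strings standing in for executables; real inputs would be file contents.
    malicious_like = byte_histogram(b"\x00\x00\x01\xff")
    benign_like = byte_histogram(b"\x00\x01\x01\xff")
    print(l1_distance(malicious_like, benign_like))
```

In this simplified picture, a generator's goal is to produce a feature vector for a malicious file that lies close to the vectors observed for benign files, without ever consulting the detector.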
