Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection

Machine-learning-based malware detectors are widely deployed in antivirus and endpoint detection systems, yet their reliance on static features makes them vulnerable to adversarial manipulation. This paper investigates whether a malware sample can be intentionally misclassified as a specific benign software category, not merely as "not malware", by adding a small number of Win32 API imports characteristic of that category, without removing any existing imports or retraining the detector. We propose a framework centered on a Conditional Variational Autoencoder (CVAE) whose decoder is strictly additive: it can introduce new API imports but never remove existing ones, so malware functionality is preserved by design. For each malware sample, the framework automatically identifies the benign category the sample most closely resembles and uses that category as the evasion target. A knowledge-distilled differentiable proxy enables gradient-based training against the non-differentiable ensemble detector. Experiments on a six-class dataset of binary Win32 API import vectors extracted from 3,799 Windows executables (five benign categories, one malware class) show that, against a detector achieving 87.5% malware recall, adding just 20 API imports reduces recall to 30%. At k = 20, 99% of the samples that evade detection are classified as the intended target category. The CVAE outperforms both a frequency-based baseline and random selection at every tested injection size (k = 5 to 50). Validation on real PE files submitted to VirusTotal confirms that the attack transfers to commercial static detection engines, with an average 54.5% reduction in the number of engines flagging a sample. These findings expose a concrete vulnerability in API-based malware classifiers and demonstrate that targeted evasion into a chosen benign category is achievable with minimal, functionality-preserving modifications.
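To make the additive constraint concrete, the following is a minimal sketch, not the paper's implementation. It assumes each sample is a binary import vector and that some scoring source (a CVAE decoder, or the frequency-based baseline) supplies one preference score per API for the target category. The function names `inject_top_k` and `frequency_scores` and the toy vectors are hypothetical.

```python
import numpy as np

def inject_top_k(x, scores, k):
    """Switch on the k highest-scoring imports that are absent from x.

    x      : 1-D 0/1 array of existing imports (entries are never cleared)
    scores : 1-D float array, assumed per-API preference for the target category
    k      : number of imports to add
    """
    x = np.asarray(x, dtype=np.uint8)
    scores = np.asarray(scores, dtype=float)
    # Imports already present are excluded from the candidate set.
    masked = np.where(x == 1, -np.inf, scores)
    # Cannot add more imports than there are absent entries.
    k = min(k, int((x == 0).sum()))
    chosen = np.argsort(-masked, kind="stable")[:k]
    x_adv = x.copy()
    x_adv[chosen] = 1  # additive only: bits go from 0 to 1, never 1 to 0
    return x_adv

def frequency_scores(benign_vectors):
    """Hypothetical frequency baseline: fraction of target-category samples importing each API."""
    return np.asarray(benign_vectors, dtype=float).mean(axis=0)

# Toy example with five APIs; the values are illustrative, not from the dataset.
x = np.array([1, 0, 0, 1, 0])
target_samples = np.array([[1, 0, 1, 0, 1],
                           [1, 0, 1, 0, 0],
                           [0, 1, 1, 1, 1]])
x_adv = inject_top_k(x, frequency_scores(target_samples), k=2)
```

Because the result only ever sets bits that were zero, `x_adv >= x` holds elementwise, which is the property the abstract describes as preserving functionality by design.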
