MalPurifier: Enhancing Android Malware Detection with Adversarial Purification against Evasion Attacks

Machine learning (ML) is widely used in Android malware detection to counter the rapid proliferation of malware. However, recent studies have shown that ML-based detection systems are inherently vulnerable to evasion attacks. Although several defenses have been proposed, many of them suffer from limited effectiveness or poor generalization. In this paper, we introduce MalPurifier, a novel Android malware detection method that uses adversarial purification to remove perturbations in an independent preprocessing step, mitigating attacks in a lightweight and flexible way. Specifically, MalPurifier employs a Denoising AutoEncoder (DAE)-based purification model that preprocesses input samples, removing potential perturbations so that the samples are classified correctly. To strengthen the defense, we propose a diversified adversarial perturbation mechanism that hardens the purification model against the different manipulations used by various evasion attacks. We also add randomized "protective noise" to benign samples to prevent excessive purification. Furthermore, we customize the loss function of the DAE model, combining reconstruction loss and prediction loss, to improve feature representation learning and thereby achieve accurate reconstruction and classification. Experimental results on two Android malware datasets show that MalPurifier outperforms state-of-the-art defenses and significantly strengthens a vulnerable malware detector against 37 evasion attacks, achieving accuracies above 90.91%. Notably, MalPurifier can be easily applied to other detectors, offering flexibility and robustness in its implementation.
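
To make the training objective concrete, the following is a minimal sketch of a combined reconstruction-plus-prediction loss, together with a simple input-corruption step. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not specify the loss terms, their weighting, or how perturbations and protective noise are generated. The sketch assumes binary feature vectors, uses binary cross-entropy for both terms, and uses random bit flips as a stand-in for both adversarial perturbations and protective noise. The names `corrupt`, `bce`, `purifier_loss`, and the `weight` parameter are hypothetical.

```python
import math
import random


def corrupt(features, flip_rate, rng):
    """Flip each binary feature with probability flip_rate.

    Assumption: random flips stand in for both the diversified adversarial
    perturbations and the protective noise; the real mechanisms are not
    described in the abstract.
    """
    return [1 - x if rng.random() < flip_rate else x for x in features]


def bce(target, prob, eps=1e-7):
    """Binary cross-entropy between a 0/1 target and a probability."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def purifier_loss(clean, reconstructed, label, predicted, weight=1.0):
    """Reconstruction loss plus a weighted prediction loss.

    clean:         original 0/1 feature vector (the reconstruction target)
    reconstructed: DAE output probabilities for the corrupted input
    label:         ground truth, 1 = malware, 0 = benign
    predicted:     classifier's malware probability on the reconstruction
    weight:        hypothetical trade-off coefficient (not from the source)
    """
    recon = sum(bce(t, p) for t, p in zip(clean, reconstructed)) / len(clean)
    return recon + weight * bce(label, predicted)


rng = random.Random(0)
clean = [1, 0, 0, 1, 0, 1]
noisy = corrupt(clean, flip_rate=0.3, rng=rng)  # input a DAE would receive

# A faithful reconstruction with a confident, correct prediction ...
good = purifier_loss(clean, [0.9, 0.1, 0.1, 0.9, 0.1, 0.9], 1, 0.95)
# ... should score lower than an uninformative one.
bad = purifier_loss(clean, [0.5] * 6, 1, 0.5)
print(noisy, good < bad)
```

In a real training loop, `reconstructed` and `predicted` would come from the DAE and the detector respectively, and the summed loss would be minimized by gradient descent; here they are hard-coded only to show how the two terms combine.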
