Machine learning (ML) techniques can automate the detection of malicious software (malware), but they are vulnerable to evasion attacks. Many studies counter such attacks heuristically, without theoretical guarantees and with limited defense effectiveness. In this paper, we propose a new adversarial training framework, termed Principled Adversarial Malware Detection (PAD), which offers convergence guarantees for robust optimization methods. PAD rests on a learnable convex measurement that quantifies distribution-wise discrete perturbations to protect malware detectors from adversaries; as a result, for smooth detectors, adversarial training admits a theoretical treatment. To promote defense effectiveness, we propose a new mixture of attacks to instantiate PAD, which enhances deep neural network-based measurements and malware detectors. Experimental results on two Android malware datasets demonstrate that: (i) the proposed method significantly outperforms state-of-the-art defenses; (ii) it hardens ML-based malware detection against 27 evasion attacks, with detection accuracies greater than 83.45%, at the cost of an accuracy decrease smaller than 2.16% in the absence of attacks; and (iii) it matches or outperforms many anti-malware scanners in VirusTotal against realistic adversarial malware.