Malware detection has long been the stage for an arms race between malware authors and anti-virus systems. Solutions that utilize machine learning (ML) are gaining traction as the scale of this arms race increases. This trend, however, makes attacking the ML models directly an attractive prospect for adversaries. We study this arms race from both perspectives in the context of MalConv, a popular convolutional neural network-based malware classifier that operates on the raw bytes of files. First, we show that MalConv is vulnerable to adversarial patch attacks: appending a byte-level patch to malware files bypasses detection 94.3% of the time. Moreover, we develop a universal adversarial patch (UAP) attack in which a single patch can, in constant time, drop the detection rate of any malware file that contains it by 80%. These patches are effective even though they are relatively small with respect to the original file size -- between 2% and 8%. As a countermeasure, we then perform window ablation, which allows us to apply de-randomized smoothing, a modern certified defense against patch attacks in vision tasks, to raw files. The resulting `smoothed-MalConv' detects over 80% of malware that contains the universal patch and provides certified robustness up to 66%, a promising step towards robust malware detection. To our knowledge, we are the first to apply a universal adversarial patch attack and an ablation-based certified defense at the byte level in the malware field.
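The idea behind window ablation with de-randomized smoothing can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes non-overlapping fixed-size windows, a plain majority vote over per-window predictions, and a contiguous patch of known maximum length. The names `window_ablations`, `max_windows_hit`, `smoothed_predict` and `toy_classifier` are hypothetical, and the toy classifier merely stands in for a MalConv-style base model.

```python
from math import ceil
from typing import Callable, List, Tuple


def window_ablations(data: bytes, window: int) -> List[bytes]:
    """Split a file into non-overlapping windows; each one is classified
    on its own, with the rest of the file ablated (an assumption here)."""
    return [data[i:i + window] for i in range(0, len(data), window)]


def max_windows_hit(patch_len: int, window: int) -> int:
    """Upper bound on how many non-overlapping windows a contiguous
    patch of `patch_len` bytes can overlap."""
    if patch_len <= 0:
        return 0
    return ceil((patch_len - 1) / window) + 1


def smoothed_predict(
    data: bytes,
    base_classifier: Callable[[bytes], int],
    window: int,
    patch_len: int,
) -> Tuple[int, bool]:
    """Majority vote over window predictions (1 = malicious, 0 = benign).

    The vote is reported as certified when the margin is larger than
    twice the number of windows the patch could influence, because each
    influenced window can move at most one vote from one class to the other.
    """
    votes = [base_classifier(chunk) for chunk in window_ablations(data, window)]
    malicious = sum(votes)
    benign = len(votes) - malicious
    label = 1 if malicious > benign else 0  # ties fall back to benign
    margin = abs(malicious - benign)
    certified = margin > 2 * max_windows_hit(patch_len, window)
    return label, certified


def toy_classifier(chunk: bytes) -> int:
    """Hypothetical stand-in for the base model: flags any window
    containing the byte 0xCC."""
    return 1 if b"\xcc" in chunk else 0


sample = b"\xcc" * 40  # 10 windows of 4 bytes, all flagged by the toy model
print(smoothed_predict(sample, toy_classifier, window=4, patch_len=5))
print(smoothed_predict(sample, toy_classifier, window=4, patch_len=17))
```

In this toy setting a 5-byte patch can touch at most 2 of the 10 windows, so the 10-vote margin survives and the prediction is certified. A 17-byte patch can touch 5 windows, which is enough to erase the margin, so no certificate is issued.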