DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness

Machine Learning (ML) models have been used for malware detection for over two decades. This has fueled an ongoing arms race between malware authors and antivirus systems, compelling researchers to propose defenses for malware-detection models against evasion attacks. However, most if not all existing defenses against evasion attacks suffer from sizable performance degradation and/or defend against only specific attacks, which makes them less practical in real-world settings. In this work, we develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection. Specifically, we propose a window ablation scheme that provably limits the impact of adversarial bytes while maximally preserving the local structure of executables. After showing that DRSM is theoretically robust against attacks with contiguous adversarial bytes, we verify its performance and certified robustness experimentally, observing only marginal accuracy drops as the cost of robustness. To our knowledge, we are the first to offer certified robustness in the realm of static detection of malware executables. More surprisingly, by evaluating DRSM against 9 empirical attacks of different types, we observe that the proposed defense is empirically robust to some extent against a diverse set of attacks, some of which even fall outside the scope of its original threat model. In addition, we collected 15.5K recent benign raw executables from diverse sources, which will be made public as a dataset called PACE (Publicly Accessible Collection(s) of Executables), to alleviate the scarcity of publicly available benign datasets for studying malware detection and to provide future research with more representative, up-to-date data.
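
The window ablation idea can be illustrated with a small sketch. The abstract does not spell out the voting or certification rule, so the code below follows the general de-randomized smoothing recipe under stated assumptions: the file is split into non-overlapping windows, a base classifier votes on each window independently, and the majority label is certified when the vote margin exceeds twice the number of windows that a contiguous patch could touch. The names `split_windows`, `max_affected_windows`, `smoothed_predict`, `is_certified`, and `toy_classifier` are hypothetical, the toy classifier stands in for MalConv, and the bound assumes the adversary overwrites bytes in place without shifting the rest of the file.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def split_windows(data: bytes, window_size: int) -> List[bytes]:
    """Split a byte sequence into consecutive, non-overlapping windows."""
    return [data[i:i + window_size] for i in range(0, len(data), window_size)]


def max_affected_windows(patch_len: int, window_size: int) -> int:
    """Conservative upper bound on the number of windows that one contiguous
    in-place patch of patch_len bytes can overlap (assumption: no byte shifting)."""
    if patch_len <= 0:
        return 0
    return math.ceil(patch_len / window_size) + 1


def smoothed_predict(
    data: bytes,
    window_size: int,
    base_classifier: Callable[[bytes], int],
) -> Tuple[int, int]:
    """Majority vote over per-window predictions (0 = benign, 1 = malware).

    Returns (label, margin), where margin is the absolute vote difference.
    Ties go to benign here; that tie-breaking rule is an assumption.
    """
    votes = Counter(base_classifier(w) for w in split_windows(data, window_size))
    malware_votes, benign_votes = votes[1], votes[0]
    label = 1 if malware_votes > benign_votes else 0
    return label, abs(malware_votes - benign_votes)


def is_certified(margin: int, patch_len: int, window_size: int) -> bool:
    """Each affected window can flip at most one vote, which changes the margin
    by at most 2, so a margin strictly above 2 * delta cannot be overturned."""
    return margin > 2 * max_affected_windows(patch_len, window_size)


def toy_classifier(window: bytes) -> int:
    """Hypothetical stand-in for a real per-window model such as MalConv."""
    return 1 if b"EVIL" in window else 0


if __name__ == "__main__":
    sample = b"A" * 80  # 10 windows of 8 bytes, all voted benign by the toy model
    label, margin = smoothed_predict(sample, 8, toy_classifier)
    print(label, margin, is_certified(margin, patch_len=8, window_size=8))
```

In this sketch, a larger window preserves more local structure for the base classifier but leaves fewer votes, so the same patch length consumes a larger share of the margin; the abstract's "marginal accuracy drops as the cost of robustness" refers to this kind of trade-off.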
