ML models are known to be vulnerable to adversarial query attacks. In these attacks, queries are iteratively perturbed towards a particular class without any knowledge of the target model besides its output. The prevalence of remotely-hosted ML classification models and Machine-Learning-as-a-Service platforms means that query attacks pose a real threat to the security of these systems. To deal with this, several stateful defenses have been proposed in recent years; they detect query attacks and prevent the generation of adversarial examples by monitoring and analyzing the sequence of queries received by the system. However, these defenses rely solely on similarity or out-of-distribution detection methods that may be effective in other domains. In the malware detection domain, adversarial examples are generated in inherently different ways, and we find that such detection mechanisms are significantly less effective there. Hence, in this paper, we present MalProtect, a stateful defense against query attacks in the malware detection domain. MalProtect uses several threat indicators to detect attacks. Our results show that it reduces the evasion rate of adversarial query attacks by over 80% on Android and Windows malware, across a range of attacker scenarios. In the first evaluation of its kind, we show that MalProtect outperforms prior stateful defenses, especially under the peak adversarial threat.
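To make the idea of a stateful defense concrete, the sketch below shows a toy similarity-based monitor of the kind the abstract attributes to prior work: it keeps a bounded history of recent queries and flags a new query that lies very close to an earlier one. It is not MalProtect, whose threat indicators are not specified in the abstract. The class name `SimilarityMonitor`, the binary feature-vector representation, the Hamming-distance metric, and the `history_size` and `max_distance` parameters are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Sequence, Tuple


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length feature vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


class SimilarityMonitor:
    """Hypothetical stateful monitor: remembers recent queries, flags near-duplicates."""

    def __init__(self, history_size: int = 100, max_distance: int = 2) -> None:
        # Bounded history: the oldest query is dropped once the limit is reached.
        self.history: Deque[Tuple[int, ...]] = deque(maxlen=history_size)
        self.max_distance = max_distance

    def is_suspicious(self, query: Sequence[int]) -> bool:
        """Return True if the query is within max_distance of any stored query."""
        q = tuple(query)
        flagged = any(hamming(q, past) <= self.max_distance for past in self.history)
        # Store the query either way so later queries are compared against it.
        self.history.append(q)
        return flagged


if __name__ == "__main__":
    monitor = SimilarityMonitor(history_size=10, max_distance=1)
    print(monitor.is_suspicious([1, 0, 0, 1]))  # first query, empty history
    print(monitor.is_suspicious([1, 0, 1, 1]))  # one feature flipped
    print(monitor.is_suspicious([0, 1, 0, 0]))  # far from both earlier queries
```

A monitor like this only catches attacks whose successive queries stay close together in feature space, which is the assumption the abstract argues does not carry over well to malware, where adversarial examples are generated differently.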