As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level vulnerabilities with real-world impact. This paper studies how adversarial attacks targeting an ML component can degrade or bypass an entire production-grade malware detection system, presenting a case study of Gmail's pipeline, where file-type identification relies on an ML model. Gmail's malware detection pipeline uses a machine learning model to route each potential malware sample to a specialized malware classifier, improving both accuracy and performance. This model, called Magika, has been open sourced. By crafting adversarial examples that fool Magika, we can cause the production malware service to route malware to an unsuitable malware detector, increasing the chance of evading detection. Specifically, by changing just 13 bytes of a malware sample, we successfully evade Magika in 90% of cases, allowing us to send malware files over Gmail. We then turn to defenses and develop an approach that mitigates the severity of such attacks: against our defended production model, a highly resourced adversary must modify 50 bytes to achieve just a 20% attack success rate. We implemented this defense and, through a collaboration with Google engineers, it has already been deployed in production for the Gmail classifier.
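To make the threat model concrete, the core idea of a byte-level evasion attack can be sketched as a black-box search: modify a small number of bytes in a file until the file-type classifier's prediction flips, so the file is routed to the wrong downstream scanner. The sketch below is a deliberately simplified illustration, not the paper's actual attack (which targets the real Magika model); the magic-byte "classifier" and the greedy single-byte search are toy assumptions introduced here for illustration only.

```python
def toy_filetype_classifier(data: bytes) -> str:
    # Toy stand-in for a learned file-type model (NOT Magika):
    # identifies files purely by their leading "magic" bytes.
    if data.startswith(b"MZ"):
        return "exe"
    if data.startswith(b"%PDF"):
        return "pdf"
    return "txt"


def greedy_byte_attack(data: bytes, classifier, max_edits: int = 13):
    """Greedily try single-byte substitutions until the predicted
    file type changes. Returns (modified_bytes, edited_positions),
    or (data, None) if no label flip was found within the budget.

    A real attack on a neural classifier would need a coordinated
    multi-byte search (e.g. gradient-guided); this greedy loop is
    only meant to illustrate the objective.
    """
    original = classifier(data)
    buf = bytearray(data)
    edited = []
    for pos in range(len(buf)):
        if len(edited) >= max_edits:
            break
        saved = buf[pos]
        for candidate in range(256):
            buf[pos] = candidate
            if classifier(bytes(buf)) != original:
                edited.append(pos)
                return bytes(buf), edited
        buf[pos] = saved  # no flip at this position; revert
    return data, None
```

For the toy classifier, corrupting the first magic byte of an `MZ`-prefixed payload is enough to change the predicted type from `exe` to `txt`, after which the sample would be routed to a scanner unsuited to executables.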