Binary function classifiers play a crucial role in maintaining the security and integrity of software systems by detecting malicious code and unauthorized modifications. However, machine learning-based classifiers are vulnerable to adversarial attacks that evade detection. In this study, we present Kelpie, a novel framework for executing mimicry attacks, a stronger form of targeted evasion attack, on binary function classifiers in a black-box, zero-query setting. Unlike previous approaches, which rely on querying the target classifier to refine untargeted evasion attacks, Kelpie leverages code transformations that preserve the functionality of malicious payloads while causing them to be misclassified as a target of the attacker's choosing. Through extensive experimentation, we demonstrate that Kelpie can successfully execute mimicry attacks against six state-of-the-art binary function classifiers representing different model architectures, without requiring any direct interaction with them. We further validate our approach with a practical demonstration involving a keylogger and a wiper concealed within benign-looking functions embedded in an application. To the best of our knowledge, this work is the first to demonstrate such a mimicry attack in a black-box, zero-query context, raising important questions about the reliability and security of existing machine learning-based binary function classifiers.