Code obfuscation is widely adopted in modern software development to protect intellectual property and hinder reverse engineering, but it also provides attackers with a powerful means to conceal malicious logic inside otherwise legitimate JavaScript code. In a software supply chain where a single compromised package can affect thousands of applications, this raises a critical question: how robust are the Static Application Security Testing (SAST) tools that CI/CD pipelines rely on as automated security gatekeepers? This paper answers that question by empirically quantifying the impact of JavaScript obfuscation on state-of-practice SAST. We define a realistic supply-chain threat model in which an adversary injects vulnerable code and iteratively obfuscates it until the pipeline reports a clean scan. To measure the resulting degradation, we introduce the Vulnerability Detection Loss (VDL) metric and conduct a two-phase study. First, we analyze 16 vulnerable-by-design Node.js web applications from the OWASP directory; second, we extend the analysis to 260 in-the-wild JavaScript/Node.js projects from GitHub. Across both datasets, we apply eight semantics-preserving obfuscation techniques and their combinations and evaluate two representative SAST tools, Njsscan and Bearer. Even a single obfuscation technique typically suppresses most baseline findings, including high-severity issues, while stacking techniques yield near-total evasion, with VDL often approaching 100%. Our results show that current JavaScript SAST is fundamentally not robust against commonplace obfuscations and that "clean" reports on obfuscated code may offer only a false sense of security. Finally, we discuss practical mitigation guidelines and directions for obfuscation-aware analysis.
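The abstract names the Vulnerability Detection Loss (VDL) metric without giving its formula. As a minimal sketch, assuming VDL is the percentage of baseline findings that a tool no longer reports once the code is obfuscated, it could be computed as follows. The function name, the count-based inputs, and the clamping at zero are illustrative assumptions, not the paper's definition.

```python
def vulnerability_detection_loss(baseline_findings: int, obfuscated_findings: int) -> float:
    """Return an assumed VDL: percent of baseline findings lost after obfuscation.

    ASSUMPTION: VDL = 100 * (baseline - obfuscated) / baseline. The abstract does
    not state the formula, so this is one plausible reading only.
    """
    if baseline_findings <= 0:
        # Without baseline findings there is nothing to lose, so the ratio is undefined.
        raise ValueError("baseline_findings must be positive")
    # Clamp at zero so that extra findings on obfuscated code do not give a negative loss.
    lost = max(baseline_findings - obfuscated_findings, 0)
    return 100.0 * lost / baseline_findings


if __name__ == "__main__":
    # Hypothetical numbers: 20 findings on the original code, 3 after obfuscation.
    print(vulnerability_detection_loss(20, 3))
```

Under this reading, a scan that reports nothing on obfuscated code gives a VDL of 100%, which corresponds to the "clean" report the threat model describes.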