Program analysis tools often produce large volumes of candidate vulnerability reports that require costly manual review, creating a practical challenge: how can security analysts prioritize the reports most likely to be true vulnerabilities? This paper investigates whether machine learning can be applied to prioritize vulnerabilities reported by program analysis tools. We focus on Node.js packages, collecting a benchmark of 1,883 Node.js packages, each containing one reported ACE or ACI vulnerability. We evaluate a variety of machine learning approaches, including classical models, graph neural networks (GNNs), large language models (LLMs), and hybrid models that combine GNNs and LLMs, all trained on data derived from a dynamic program analysis tool's output. The top LLM achieves $F_{1} {=} 0.915$, while the best GNN and classical ML models reach $F_{1} {=} 0.904$. At a false-negative rate below 7%, the leading model eliminates 66.9% of benign packages from manual review, taking around 60 ms per package. When the best model is tuned to operate at a precision of 0.8 (i.e., allowing 20% false positives among all warnings), our approach detects 99.2% of exploitable taint flows while missing only 0.8%, demonstrating strong potential for real-world vulnerability triage.
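The operating point described above (tuning the classifier to a target precision and reporting the recall achieved there) can be sketched as a simple threshold search over model scores. This is a minimal illustration, not the paper's implementation; the helper name and the toy data are hypothetical.

```python
import numpy as np

def threshold_for_precision(scores, labels, target_precision=0.8):
    """Hypothetical helper: find the lowest score threshold whose
    precision meets the target, maximizing recall at that level.

    scores: model confidence that a report is a true vulnerability
    labels: 1 = exploitable, 0 = benign
    """
    order = np.argsort(-np.asarray(scores))
    s = np.asarray(scores)[order]
    y = np.asarray(labels)[order]
    tp = np.cumsum(y)            # true positives if we flag the top-k reports
    fp = np.cumsum(1 - y)        # false positives among the top-k
    precision = tp / (tp + fp)
    ok = np.flatnonzero(precision >= target_precision)
    if ok.size == 0:
        return None, 0.0         # target precision unattainable
    k = ok.max()                 # largest cutoff still meeting the target
    recall = tp[k] / y.sum()
    return s[k], recall

# Toy example: scores and ground-truth labels for five reports
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
labels = np.array([1, 1, 0, 1, 0])
thr, rec = threshold_for_precision(scores, labels, target_precision=0.75)
```

In practice the threshold would be chosen on a validation split and then applied to incoming analysis reports, trading a bounded false-positive budget for near-complete coverage of exploitable flows.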