CrossCommitVuln-Bench: A Dataset of Multi-Commit Python Vulnerabilities Invisible to Per-Commit Static Analysis


Arunabh Majumdar

From arXiv. Accepted at AIware 2026 (3rd ACM International Conference on AI-Powered Software, Montreal, July 6-7, 2026). 4 pages.

We present CrossCommitVuln-Bench, a curated benchmark of 15 real-world Python vulnerabilities (CVEs) in which the exploitable condition was introduced across multiple commits - each individually benign to per-commit static analysis - but collectively critical. We manually annotate each CVE with its contributing commit chain, a structured rationale for why each commit evades per-commit analysis, and baseline evaluations using Semgrep and Bandit in both per-commit and cumulative scanning modes. Our central finding: the per-commit detection rate (CCDR) is 13% across all 15 vulnerabilities - 87% of chains are invisible to per-commit SAST. Critically, both per-commit detections are qualitatively poor: one occurs on commits framed as security fixes (where developers suppress the alert), and the other detects only the minor hardcoded-key component while completely missing the primary vulnerability (200+ unprotected API endpoints). Even in cumulative mode (full codebase present), the detection rate is only 27%, confirming that snapshot-based SAST tools often miss vulnerabilities whose introduction spans multiple commits. The dataset, annotation schema, evaluation scripts, and reproducible baselines are released under open-source licenses to support research on cross-commit vulnerability detection.
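As a rough illustration of how the two reported rates relate to the underlying counts, the sketch below computes a per-commit and a cumulative detection rate over 15 chains. Everything in it is an assumption made for illustration: the `ChainResult` record is hypothetical and is not the dataset's annotation schema, the records are synthetic placeholders rather than the benchmark's CVEs, and the rule that a chain counts as detected per-commit when any one of its commits is flagged is a guess at the metric's definition. The counts of 2 and 4 detected chains are simply the values that round to the reported 13% and 27% out of 15.

```python
from dataclasses import dataclass


@dataclass
class ChainResult:
    """Hypothetical record for one CVE's commit chain (not the real schema)."""
    cve_id: str
    per_commit_hits: list[bool]  # one flag per commit, each scanned in isolation
    cumulative_hit: bool         # flag from scanning the full codebase snapshot


def per_commit_detected(result: ChainResult) -> bool:
    # Assumption: a chain is "detected" if any single commit raises a finding.
    return any(result.per_commit_hits)


def cumulative_detected(result: ChainResult) -> bool:
    return result.cumulative_hit


def detection_rate(results, detected) -> float:
    """Fraction of chains for which `detected(result)` is true."""
    if not results:
        raise ValueError("no results to evaluate")
    return sum(1 for r in results if detected(r)) / len(results)


# Synthetic placeholders: 15 chains, 2 flagged per-commit, 4 flagged cumulatively.
results = [
    ChainResult(
        cve_id=f"PLACEHOLDER-{i:02d}",
        per_commit_hits=[False, i < 2, False],
        cumulative_hit=i < 4,
    )
    for i in range(15)
]

per_commit_rate = detection_rate(results, per_commit_detected)
cumulative_rate = detection_rate(results, cumulative_detected)
print(f"per-commit: {per_commit_rate:.0%}, cumulative: {cumulative_rate:.0%}")
```

Under these assumptions, 2 of 15 chains gives roughly 13% and 4 of 15 gives roughly 27%, so the remaining 13 chains (about 87%) produce no per-commit finding at all.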

