PYPILINE: Malicious PyPI Package Detection via Suspicious API Knowledge and Agent Workflow


Siyuan Pang, Zhengwei Jiang, Yepeng Yao, Zijing Fan, Haozhe Li, Baoxu Liu

The detection of malicious PyPI packages is crucial for maintaining the security of the open-source software supply chain. Existing methods, which rely primarily on rules or traditional machine learning, suffer from poor interpretability and adapt poorly to novel attacks. To address this, we propose PYPILINE, a detection method that combines a suspicious API knowledge base with an Agent workflow. PYPILINE first conducts static analysis on known malicious packages, extracting abstract syntax trees and generating API call graphs, from which it automatically constructs a structured suspicious API knowledge base. During the detection phase, this knowledge base is used to enhance reasoning capabilities. Through an Agent workflow, PYPILINE performs in-depth semantic analysis of unknown packages and outputs a structured, interpretable maliciousness assessment report. The experimental results show that PYPILINE significantly outperforms existing state-of-the-art tools, achieving a precision of 96.7%, a recall of 99.6%, and an F1-score of 98.1%, with its precision surpassing baseline tools by 5.7 to 24.2 percentage points. Additionally, we conducted an empirical study on malicious packages, systematically revealing prevalent attack strategies as well as the most commonly abused APIs. Equipped with tool-calling AI agent workflows for automated vector database retrieval of suspicious API knowledge and mail server delivery of analysis reports, PYPILINE delivers a practical, efficient, and convenient malicious package detection solution to strengthen open-source ecosystem security.
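
The static-analysis step described in the abstract (parse a package's source into an abstract syntax tree, then collect the API calls it makes) can be illustrated with a minimal sketch using Python's standard `ast` module. This is not PYPILINE's implementation: the `SUSPICIOUS_APIS` set, the helper names, and the sample snippet are all hypothetical, and the exact-match lookup merely stands in for the paper's automatically built knowledge base and vector retrieval.

```python
import ast

# Hypothetical watch-list for illustration only; the paper's knowledge base is
# constructed automatically from known malicious packages.
SUSPICIOUS_APIS = {"os.system", "subprocess.Popen", "base64.b64decode", "exec", "eval"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def extract_calls(source):
    """Parse source into an AST and collect (line number, callee name) pairs."""
    tree = ast.parse(source)
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name is not None:
                calls.append((node.lineno, name))
    return sorted(calls)  # ast.walk is breadth-first, so sort for a stable order


def flag_suspicious(source):
    """Keep only the calls whose name appears in the watch-list."""
    return [(line, name) for line, name in extract_calls(source)
            if name in SUSPICIOUS_APIS]


# Hypothetical sample: decode a hidden command, then run it.
SAMPLE = '''
import os, base64
payload = base64.b64decode("ZWNobyBoaQ==")
os.system(payload.decode())
print("done")
'''

if __name__ == "__main__":
    for line, name in flag_suspicious(SAMPLE):
        print(f"line {line}: {name}")
```

The sketch matches callee names exactly as written in the source, so it does not resolve import aliases (for example `import os as o`) or calls reached through `getattr`; a call graph of the kind the abstract mentions would need that resolution step.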
