Third-party agent skills extend LLM-based agents with instruction files and executable code that run on users' machines. Skills execute with user privileges and are distributed through community registries with minimal vetting, but no ground-truth dataset exists to characterize the resulting threats. We construct the first labeled dataset of malicious agent skills by behaviorally verifying 98,380 skills from two community registries, confirming 157 malicious skills with 632 vulnerabilities. These attacks are not incidental. Malicious skills average 4.03 vulnerabilities across a median of three kill chain phases, and the ecosystem has split into two archetypes: Data Thieves that exfiltrate credentials through supply chain techniques, and Agent Hijackers that subvert agent decision-making through instruction manipulation. A single actor accounts for 54.1\% of confirmed cases through templated brand impersonation. Shadow features, capabilities absent from public documentation, appear in 0\% of basic attacks but 100\% of advanced ones; several skills go further by exploiting the AI platform's own hook system and permission flags. Responsible disclosure led to 93.6\% removal within 30 days. We release the dataset and analysis pipeline to support future work on agent skill security.