SkillGuard: A Permission Framework for Agent Skills

Agent skills extend LLM agents with reusable instructions, scripts, tool bindings, and contextual dependencies. However, current skill ecosystems largely rely on trust-based loading and static inspection, leaving a gap between what a skill can inject into an agent's context and what it can cause the agent to do at runtime. This gap introduces new security and privacy risks. Existing defenses primarily inspect skill files statically or regulate individual tool calls, without systematically connecting a skill's declared intent with its runtime behavior. In this paper, we present SkillGuard, a skill-centric permission framework that treats skills as permission-bearing executable artifacts. SkillGuard introduces a dual-plane governance model that jointly regulates context influence and action side effects through skill manifests, runtime access control, user-mediated authorization, deny-by-default enforcement, capability inference, and behavior monitoring. We evaluate SkillGuard on 315 real-world skills and the SkillInject dataset. The permission taxonomy covers 99.76% of observed protected objects, and automated manifest generation reaches 91.0% F1. In adversarial evaluations, SkillGuard reduces the attack success rate from 32.37% to 23.02% for contextual injections and from 25.56% to 16.67% for obvious injections, while maintaining benign task utility. These results suggest that a skill-centric permission framework such as SkillGuard can provide a practical foundation for improving the privacy and security of agent skill ecosystems.
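
To make the dual-plane, deny-by-default idea concrete, the following is a minimal sketch and not SkillGuard's actual implementation. It assumes a manifest is simply two sets of permission strings, one for the context plane and one for the action plane. The names `SkillManifest`, `check`, and `ask_user`, as well as the permission strings, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SkillManifest:
    """Hypothetical manifest: permissions a skill declares on each plane."""
    name: str
    # Context plane: what the skill may load into the agent's context.
    context: frozenset = field(default_factory=frozenset)
    # Action plane: side effects the skill may cause the agent to perform.
    action: frozenset = field(default_factory=frozenset)


def check(manifest, plane, permission, ask_user=None):
    """Deny by default: allow only declared or user-approved requests."""
    declared = {"context": manifest.context, "action": manifest.action}.get(plane)
    if declared is None:
        return False  # unknown plane: deny
    if permission in declared:
        return True  # declared in the manifest
    if ask_user is not None:
        # User-mediated authorization for undeclared requests (assumed callback).
        return bool(ask_user(manifest.name, plane, permission))
    return False  # undeclared and nobody to ask: deny


# Illustrative manifest; the skill name and permission strings are made up.
manifest = SkillManifest(
    name="pdf-summarizer",
    context=frozenset({"read:workspace/docs"}),
    action=frozenset({"write:workspace/summaries"}),
)

print(check(manifest, "context", "read:workspace/docs"))
print(check(manifest, "action", "net:send"))
```

In this sketch a request that is neither declared nor approved by the user is rejected, so an injected instruction that tries to trigger an undeclared side effect fails the check rather than passing silently. Capability inference and behavior monitoring, which the abstract also lists, are outside the scope of this example.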
