From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

Large language model (LLM) agents increasingly rely on reusable skills: capability packages that combine instructions, control flow, constraints, and tool calls. In current agent systems, however, skills are still represented by text-heavy artifacts, mainly SKILL.md-style documents, whose machine-usable evidence remains largely embedded in natural-language descriptions. Skill-centered agent systems therefore face a representation problem: both managing skill collections and using skills during agent execution require reasoning over invocation interfaces, execution structure, and concrete side effects, yet these signals are often entangled in a single textual surface. An explicit representation of skill knowledge may make these artifacts easier for machines to acquire and use. Drawing on Memory Organization Packets, Script Theory, and Conceptual Dependency from Schank and Abelson's classical work on cognitive and linguistic representation, we introduce the Scheduling-Structural-Logical (SSL) representation. To our knowledge, it is the first structured representation for agent skill artifacts that disentangles skill-level scheduling signals, scene-level execution structure, and logic-level action and resource-use evidence. We instantiate SSL with an LLM-based normalizer and evaluate SSL-derived representations on two tasks, Skill Discovery and Risk Assessment. Our experiments show that SSL significantly outperforms text-only baselines: in Skill Discovery, MRR@50 improves from 0.649 to 0.729; in Risk Assessment, macro F1 improves from 0.409 to 0.509. These findings suggest that an explicit, source-grounded structure can make agent skills easier to search and review. They position SSL as a practical step toward more inspectable, reusable, and operationally actionable skill representations, rather than as a finished standard or an end-to-end skill-management mechanism.
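For readers less familiar with the two reported metrics, the following is a minimal sketch of how MRR@k and macro F1 are conventionally computed. It is not the paper's evaluation code: the function names, the assumption of exactly one relevant skill per query, and the toy inputs are all illustrative assumptions, and the actual label set and evaluation protocol are not specified in this abstract.

```python
from typing import Hashable, Sequence


def mrr_at_k(rankings: Sequence[Sequence[Hashable]],
             relevant: Sequence[Hashable], k: int = 50) -> float:
    """Mean reciprocal rank, counting only hits within the top k results.

    Assumption (not from the paper): each query has exactly one relevant item.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranked, target in zip(rankings, relevant):
        for rank, item in enumerate(ranked[:k], start=1):
            if item == target:
                total += 1.0 / rank
                break  # only the first hit contributes
    return total / len(rankings)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred), key=str)
    if not classes:
        return 0.0
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data with hypothetical skill names and risk labels.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevant = ["b", "z"]
mrr = mrr_at_k(rankings, relevant, k=50)
f1 = macro_f1(["hi", "lo", "lo", "hi"], ["hi", "lo", "hi", "hi"])
```

Because macro F1 weights every class equally regardless of frequency, it is sensitive to performance on rare classes, which is one common reason to prefer it over accuracy when labels are imbalanced.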
