The transition from monolithic large language models (LLMs) to modular, skill-equipped agents represents a fundamental architectural shift in artificial intelligence deployment. While general-purpose models demonstrate remarkable breadth in declarative knowledge, their utility in autonomous workflows is frequently constrained by a lack of specialized procedural expertise. This report investigates a systematic framework for the automated acquisition of high-quality agent skills by mining open-source repositories on platforms such as GitHub. We focus on extracting visualization and educational capabilities from state-of-the-art systems including TheoremExplainAgent and Code2Video, both of which use the Manim mathematical animation engine. The framework encompasses repository structural analysis, semantic skill identification through dense retrieval, and translation to the standardized SKILL.md format. We demonstrate that systematic extraction from agentic repositories, combined with rigorous security governance and multi-dimensional evaluation metrics, enables scalable acquisition of procedural knowledge that augments LLM capabilities without requiring model retraining. Our analysis reveals that agent-generated educational content can achieve 40\% gains in knowledge transfer efficiency while maintaining pedagogical quality comparable to human-crafted tutorials.