We introduce \emph{Memento-Skills}, a generalist, continually learnable LLM agent system that functions as an \emph{agent-designing agent}: it autonomously constructs, adapts, and improves task-specific agents through experience. The system is built on a memory-based reinforcement learning framework with \emph{stateful prompts}, in which reusable skills (stored as structured markdown files) serve as persistent, evolving memory. These skills encode both behaviour and context, enabling the agent to carry knowledge forward across interactions. Starting from elementary skills (such as web search and terminal operations), the agent continually improves via the \emph{Read--Write Reflective Learning} mechanism introduced in \emph{Memento~2}~\cite{wang2025memento2}. In the \emph{read} phase, a behaviour-trainable skill router selects the most relevant skill conditioned on the current stateful prompt; in the \emph{write} phase, the agent updates and expands its skill library based on new experience. This closed-loop design enables \emph{continual learning without updating LLM parameters}, as all adaptation is realised through the evolution of externalised skills and prompts. Unlike prior approaches that rely on human-designed agents, Memento-Skills enables a generalist agent to \emph{design agents end-to-end} for new tasks. Through iterative skill generation and refinement, the system progressively improves its own capabilities. Experiments on the \emph{General AI Assistants} benchmark and \emph{Humanity's Last Exam} show sustained gains, with relative improvements in overall accuracy of 26.2\% and 116.2\%, respectively. Code is available at https://github.com/Memento-Teams/Memento-Skills.
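The read and write phases described above can be illustrated with a minimal Python sketch. It is not the Memento-Skills implementation: the names \texttt{Skill}, \texttt{SkillLibrary}, and \texttt{utility} are hypothetical, and the word-overlap score merely stands in for the behaviour-trainable router, whose actual form is not specified here. The sketch only shows how adaptation can live entirely in externalised skill text and scores, with no change to model parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    markdown: str          # skill body, kept as markdown text (assumed layout)
    utility: float = 0.0   # hypothetical running score updated from feedback


@dataclass
class SkillLibrary:
    skills: dict[str, Skill] = field(default_factory=dict)

    def add(self, name: str, markdown: str) -> None:
        """Register a new skill in the library."""
        self.skills[name] = Skill(name, markdown)

    def read(self, prompt: str) -> Skill:
        """Read phase: select the skill that best matches the stateful prompt.

        Toy router (an assumption): word overlap plus the learned utility.
        """
        words = set(prompt.lower().split())

        def score(skill: Skill) -> float:
            overlap = len(words & set(skill.markdown.lower().split()))
            return overlap + skill.utility

        return max(self.skills.values(), key=score)

    def write(self, skill: Skill, reward: float, note: str) -> None:
        """Write phase: fold new experience back into the selected skill."""
        skill.utility += reward                   # adjusts future routing
        skill.markdown += f"\n- Lesson: {note}"   # evolves the skill text


# Start from two elementary skills, as in the abstract.
library = SkillLibrary()
library.add("web_search", "# Web search\nSearch the web for facts and pages.")
library.add("terminal", "# Terminal\nRun shell commands in a terminal.")

chosen = library.read("search the web for the release date")
library.write(chosen, reward=1.0, note="quote exact phrases to narrow results")
print(chosen.name)
```

In this toy run the router selects \texttt{web\_search}, and the write step both raises its utility and appends a lesson to its markdown body, so the next read sees an updated library while the underlying LLM stays fixed.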