Large language models (LLMs) have revolutionized AI, but they are constrained by limited context windows, which hinders their utility in tasks like extended conversations and document analysis. To enable the use of context beyond the limits of the context window, we propose virtual context management, a technique that draws inspiration from hierarchical memory systems in traditional operating systems, which provide the appearance of large memory resources through data movement between fast and slow memory. Using this technique, we introduce MemGPT (Memory-GPT), a system that intelligently manages different memory tiers to effectively provide extended context within the LLM's limited context window, and that uses interrupts to manage control flow between itself and the user. We evaluate our OS-inspired design in two domains where the limited context windows of modern LLMs severely handicap their performance: document analysis, where MemGPT is able to analyze large documents that far exceed the underlying LLM's context window, and multi-session chat, where MemGPT can create conversational agents that remember, reflect, and evolve dynamically through long-term interactions with their users. We release MemGPT code and data for our experiments at https://memgpt.ai.
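The idea of moving data between a small, fast tier and a large, slow tier can be illustrated with a minimal sketch. This is not MemGPT's implementation: the class `VirtualContext`, its `budget` parameter, the `recall` method, and the whitespace-based `count_tokens` helper are all hypothetical names and simplifications introduced here for illustration. The sketch assumes a first-in, first-out eviction policy and plain keyword search, neither of which is stated in the abstract.

```python
from collections import deque
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


@dataclass
class VirtualContext:
    """Hypothetical two-tier memory: a bounded main context plus an archive."""

    budget: int  # maximum number of tokens kept in the main context
    main: deque = field(default_factory=deque)    # "fast" tier, sent to the LLM
    archive: list = field(default_factory=list)   # "slow" tier, outside the window

    def used(self) -> int:
        # Tokens currently occupying the main context.
        return sum(count_tokens(m) for m in self.main)

    def append(self, message: str) -> None:
        # Add a message, then evict the oldest messages to the archive
        # until the main context fits the budget (the newest is always kept).
        self.main.append(message)
        while self.used() > self.budget and len(self.main) > 1:
            self.archive.append(self.main.popleft())

    def recall(self, keyword: str) -> list:
        # Assumption: case-insensitive substring search over the archive.
        return [m for m in self.archive if keyword.lower() in m.lower()]


ctx = VirtualContext(budget=8)
ctx.append("My name is Ada and I like chess")
ctx.append("What openings should I study")
print(list(ctx.main))     # only the newest message remains in the main context
print(ctx.recall("ada"))  # the evicted message is still retrievable
```

In this sketch the evicted message leaves the bounded context but is not lost, which mirrors the operating-system analogy of paging data out of fast memory while keeping it reachable in slow memory.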