Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows, which hinders their utility in tasks such as extended conversations and document analysis. To enable the use of context beyond these limited windows, we propose virtual context management, a technique that draws inspiration from hierarchical memory systems in traditional operating systems, which provide the appearance of large memory resources through data movement between fast and slow memory. Using this technique, we introduce MemGPT (Memory-GPT), a system that intelligently manages different memory tiers to effectively provide extended context within the LLM's limited context window, and uses interrupts to manage control flow between itself and the user. We evaluate our OS-inspired design in two domains where the limited context windows of modern LLMs severely handicap their performance: document analysis, where MemGPT is able to analyze large documents that far exceed the underlying LLM's context window, and multi-session chat, where MemGPT can create conversational agents that remember, reflect, and evolve dynamically through long-term interactions with their users. We release MemGPT code and data for our experiments at https://memgpt.ai.
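To make the idea of moving data between a fast and a slow tier concrete, the following is a minimal sketch and not MemGPT's actual implementation. It assumes a toy two-tier layout: a bounded "main" context standing in for the LLM's context window, and an unbounded "external" store for evicted messages. The class name `VirtualContext`, the whitespace token counter, the oldest-first eviction policy, and the substring-based `recall` are all hypothetical choices made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VirtualContext:
    """Toy two-tier context: a bounded main context plus external storage."""

    max_tokens: int  # hypothetical budget standing in for the context window
    main: deque = field(default_factory=deque)    # "fast" tier seen by the LLM
    external: list = field(default_factory=list)  # "slow" tier for evictions

    @staticmethod
    def count_tokens(text: str) -> int:
        # Crude whitespace tokenizer; an assumption for illustration only.
        return len(text.split())

    def used(self) -> int:
        """Total token count currently held in the main context."""
        return sum(self.count_tokens(m) for m in self.main)

    def append(self, message: str) -> None:
        """Add a message, evicting the oldest ones while over budget."""
        self.main.append(message)
        # Always keep at least the newest message, even if it is oversized.
        while self.used() > self.max_tokens and len(self.main) > 1:
            self.external.append(self.main.popleft())

    def recall(self, query: str) -> list:
        """Page evicted messages that contain the query back into main."""
        hits = [m for m in self.external if query.lower() in m.lower()]
        for m in hits:
            self.external.remove(m)
            self.append(m)  # may in turn evict older main-context messages
        return hits
```

In this sketch, appending past the budget pushes the oldest messages out to `external`, and a later `recall("name")` would bring a matching message back into `main`, possibly evicting others to make room. A real system would likely replace the substring match with a proper search over the external store and let the LLM itself decide when to page data in or out.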