Large Language Models (LLMs) face fundamental challenges in long-context reasoning: many documents exceed their finite context windows, while performance on texts that do fit degrades with sequence length, necessitating augmentation with external memory frameworks. Current solutions, which have evolved from retrieval using semantic embeddings to more sophisticated structured knowledge graph representations for improved sense-making and associativity, are tailored for fact-based retrieval and fail to build the space-time-anchored narrative representations required for tracking entities through episodic events. To bridge this gap, we propose the \textbf{Generative Semantic Workspace} (GSW), a neuro-inspired generative memory framework that builds structured, interpretable representations of evolving situations, enabling LLMs to reason over evolving roles, actions, and spatiotemporal contexts. Our framework comprises an \textit{Operator}, which maps incoming observations to intermediate semantic structures, and a \textit{Reconciler}, which integrates these into a persistent workspace that enforces temporal, spatial, and logical coherence. On the Episodic Memory Benchmark (EpBench) \cite{huet_episodic_2025}, comprising corpora ranging from 100k to 1M tokens in length, GSW outperforms existing RAG-based baselines by up to \textbf{20\%}. Furthermore, GSW is highly efficient, reducing query-time context tokens by \textbf{51\%} compared to the next most token-efficient baseline, substantially lowering inference-time costs. More broadly, GSW offers a concrete blueprint for endowing LLMs with human-like episodic memory, paving the way for more capable agents that can reason over long horizons. Code is available at \url{https://github.com/roychowdhuryresearch/gsw-memory}.