AI agents that interact with users across multiple sessions need persistent long-term memory to maintain coherent, personalized behavior. Current approaches either rely on flat retrieval-augmented generation (RAG), which loses structural relationships between memories, or use memory compression and vector retrieval, which cannot capture the associative structure of multi-session conversations. A few graph-based techniques have been proposed in the literature, but they still suffer from hub-dominated retrieval and poor hierarchical reasoning over evolving memory. We propose GAAMA, a graph-augmented associative memory system that constructs a concept-mediated hierarchical knowledge graph through a three-step pipeline: (1)~verbatim episode preservation from raw conversations, (2)~LLM-based extraction of atomic facts and topic-level concept nodes, and (3)~synthesis of higher-order reflections. The resulting graph uses four node types (episode, fact, reflection, concept) connected by five structural edge types, with concept nodes providing cross-cutting traversal paths that complement semantic similarity. Retrieval combines cosine-similarity-based $k$-nearest-neighbor search with edge-type-aware Personalized PageRank (PPR) through an additive scoring function. On the LoCoMo-10 benchmark (1,540 questions across 10 multi-session conversations), GAAMA achieves 78.9\% mean reward, outperforming a tuned RAG baseline (75.0\%), HippoRAG (69.9\%), A-Mem (47.2\%), and Nemori (52.1\%). Ablation analysis shows that augmenting semantic search with graph-traversal-based ranking (PPR) consistently improves over pure semantic search on graph nodes (+1.0 percentage point overall).
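As a rough illustration of the retrieval step described in the abstract, the sketch below adds a cosine-similarity score to an edge-type-weighted Personalized PageRank score seeded at the $k$ nearest neighbors. It is a minimal sketch under stated assumptions, not the GAAMA implementation: the edge-type names and weights, the restart probability `alpha`, the mixing weight `ppr_weight`, the seed normalization, and the toy graph are all hypothetical, since the abstract does not specify them.

```python
import math
from collections import defaultdict

# Hypothetical edge-type weights; names and values are illustrative only.
EDGE_WEIGHTS = {"has_concept": 1.0, "derived_from": 0.5}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def personalized_pagerank(edges, seeds, alpha=0.15, iters=50):
    """Power-iteration PPR on an undirected graph with edge-type weights.

    edges: iterable of (src, dst, edge_type); seeds: node -> restart mass.
    """
    adj = defaultdict(dict)
    nodes = set(seeds)
    for src, dst, etype in edges:
        w = EDGE_WEIGHTS.get(etype, 1.0)  # unknown types default to weight 1.0
        adj[src][dst] = adj[src].get(dst, 0.0) + w
        adj[dst][src] = adj[dst].get(src, 0.0) + w
        nodes.update((src, dst))
    rank = {n: seeds.get(n, 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: alpha * seeds.get(n, 0.0) for n in nodes}
        for n in nodes:
            spread = (1.0 - alpha) * rank[n]
            total = sum(adj[n].values())
            if total == 0.0:
                # Isolated node: hand its mass back to the seed distribution.
                for s, mass in seeds.items():
                    nxt[s] += spread * mass
            else:
                for nb, w in adj[n].items():
                    nxt[nb] += spread * w / total
        rank = nxt
    return rank


def retrieve(query_vec, embeddings, edges, k=1, ppr_weight=0.5, top_n=5):
    """Additive score: cosine similarity + ppr_weight * PPR seeded at the kNN."""
    sims = {n: cosine(query_vec, v) for n, v in embeddings.items()}
    knn = sorted(sims, key=sims.get, reverse=True)[:k]
    mass = {n: max(sims[n], 0.0) for n in knn}
    total = sum(mass.values()) or 1.0
    seeds = {n: m / total for n, m in mass.items()}
    ppr = personalized_pagerank(edges, seeds)
    # Nodes without an embedding (here, the concept node) get similarity 0.0.
    scores = {n: sims.get(n, 0.0) + ppr_weight * ppr.get(n, 0.0)
              for n in set(sims) | set(ppr)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data: f2 and f3 are equally dissimilar to the query, but f2 shares a
# concept node with the top hit f1, so graph traversal can surface it.
embeddings = {"f1": [1.0, 0.0], "f2": [0.0, 1.0], "f3": [0.0, 1.0]}
edges = [("f1", "c_travel", "has_concept"), ("f2", "c_travel", "has_concept")]
ranked = retrieve([1.0, 0.0], embeddings, edges)
print(ranked)
```

In this toy graph the concept node acts as the cross-cutting path: pure cosine similarity cannot separate `f2` from `f3`, while the PPR term gives `f2` a nonzero score through `c_travel`. A real system would likely filter concept nodes out of the returned list or treat them differently; that choice is left open here.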