CogMem: A Cognitive Memory Architecture for Sustained Multi-Turn Reasoning in Large Language Models

Large language models (LLMs) excel at single-turn reasoning but often lose accuracy and coherence over extended, multi-turn interactions. Recent evaluations such as TurnBench highlight recurring failure modes: reasoning bias, task drift, hallucination, overconfidence, and memory decay. Current approaches typically append the full conversational history, which causes unbounded context growth, higher computational cost, and degraded reasoning efficiency. We introduce CogMem, a cognitively inspired, memory-augmented LLM architecture that supports sustained iterative reasoning through structured, persistent memory. CogMem comprises three layers: a Long-Term Memory (LTM) that consolidates cross-session reasoning strategies; a Direct Access (DA) memory that maintains session-level notes and retrieves relevant long-term memories; and a Focus of Attention (FoA) mechanism that dynamically reconstructs a concise, task-relevant context at each turn. Experiments on TurnBench show that this layered design mitigates reasoning failures, controls context growth, and improves consistency across extended reasoning chains, moving toward more reliable, human-like reasoning in LLMs.
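To make the three-layer design concrete, the following is a minimal sketch of how the LTM, DA, and FoA layers could fit together. It is not the authors' implementation: every class, function, and parameter name here (`LongTermMemory`, `DirectAccessMemory`, `focus_of_attention`, `max_notes`, `max_strategies`) is hypothetical, and simple word overlap stands in for whatever retrieval method the system actually uses.

```python
from dataclasses import dataclass, field


def _overlap(query: str, text: str) -> int:
    """Count shared lowercase words; a toy stand-in for a real retriever."""
    return len(set(query.lower().split()) & set(text.lower().split()))


@dataclass
class LongTermMemory:
    """Hypothetical LTM: cross-session store of reasoning strategies."""
    strategies: list[str] = field(default_factory=list)

    def consolidate(self, strategy: str) -> None:
        # Store each strategy once so repeated sessions do not duplicate it.
        if strategy not in self.strategies:
            self.strategies.append(strategy)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Keep only strategies that share at least one word with the query.
        hits = [(s, _overlap(query, s)) for s in self.strategies]
        hits = [h for h in hits if h[1] > 0]
        hits.sort(key=lambda h: h[1], reverse=True)
        return [s for s, _ in hits[:k]]


@dataclass
class DirectAccessMemory:
    """Hypothetical DA: session-level notes plus access to the LTM."""
    ltm: LongTermMemory
    notes: list[str] = field(default_factory=list)

    def add_note(self, note: str) -> None:
        self.notes.append(note)

    def recall(self, query: str, k: int = 2) -> list[str]:
        return self.ltm.retrieve(query, k)


def focus_of_attention(da: DirectAccessMemory, user_turn: str,
                       max_notes: int = 3, max_strategies: int = 2) -> str:
    """Hypothetical FoA: rebuild a bounded context for the current turn.

    Only the most recent notes and the best-matching strategies are
    included, so the context does not grow with the number of turns.
    """
    lines = ["Strategies:"]
    lines += [f"- {s}" for s in da.recall(user_turn, max_strategies)]
    lines.append("Session notes:")
    lines += [f"- {n}" for n in da.notes[-max_notes:]]
    lines.append(f"Current turn: {user_turn}")
    return "\n".join(lines)
```

In this sketch the prompt passed to the model at each turn would be the string returned by `focus_of_attention`, not the full transcript. The number of items in the context is capped by `max_notes` and `max_strategies`, which is one simple way to obtain the bounded context growth the abstract describes.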
