Persistent Memory Through Triple-Loop Consolidation in a Non-Gradient Dissipative Cognitive Architecture

From arXiv: 28 pages, 7 figures, 6 tables. Submitted to Frontiers in Computational Neuroscience. Ancillary file: dm_minimal_reproduction.py (NumPy-only reproduction script, ~200 lines).

Dissipative cognitive architectures sustain computation through continuous energy expenditure: units that exhaust their energy are stochastically replaced with fresh random state. This creates a fundamental challenge: how can persistent, context-specific memory survive when all learnable state is periodically destroyed? Existing memory mechanisms, including elastic weight consolidation, synaptic intelligence, and surprise-driven gating, rely on gradient computation and are therefore inapplicable to non-gradient dissipative systems. We introduce Deep Memory (DM), a non-gradient persistent memory mechanism that operates through a triple-loop consolidation cycle: (1) recording of expert-specific content centroids, (2) seeding of replaced units with the stored representations, and (3) stabilization through continuous re-entry. We demonstrate that discrete expert routing via Mixture-of-Experts (MoE) gating is a causal prerequisite for DM, because it prevents the expert centroids from converging, which would render the stored memories identical. Across ${\sim}970$ simulation runs spanning thirteen experimental blocks: (i) discrete routing is causally necessary for specialization (mutual information $\text{MI}=1.10$ vs. $0.001$; $n=91$); (ii) DM achieves $R=0.984$ vs. $0.385$ for the no-memory baseline ($n=16$); (iii) continuous seeding reconstructs representations after interference (reconstruction correlation $R_\mathrm{recon}=0.978$; one-shot seeding fails; $n=30$); (iv) the mechanism operates within a characterized $(K,p)$ envelope ($n=350$); (v) recording $\times$ seeding is the minimal critical dyad ($n=40$); (vi) DM outperforms non-gradient baselines (Hopfield, ESN) under matched turnover ($n=370$). These results establish DM as a falsifiable mechanism for persistent memory in non-gradient cognitive systems, with functional parallels to hippocampal consolidation.
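To make the triple-loop cycle concrete, here is a minimal NumPy sketch of one way the three loops could interact under unit turnover. It is not the authors' dm_minimal_reproduction.py, and it makes no attempt to reproduce the reported numbers. Everything in it is an illustrative assumption: the function name `simulate`, the parameters `p_replace`, `pull`, `ema`, and `noise`, the round-robin stand-in for MoE gating, the relaxation update toward the input, the exponential-moving-average centroid, and the exposure-then-retention protocol scored by a Pearson correlation.

```python
import numpy as np


def simulate(use_memory, n_experts=4, n_units=16, dim=32,
             exposure_steps=200, retention_steps=300,
             p_replace=0.05, pull=0.2, ema=0.1, noise=0.1, seed=0):
    """Toy turnover model; returns the mean correlation between each
    expert's average unit state and the context it was exposed to."""
    rng = np.random.default_rng(seed)
    # Hypothetical setup: one fixed context vector per expert.
    prototypes = rng.normal(size=(n_experts, dim))
    # Volatile state: every unit can be destroyed at any step.
    units = rng.normal(size=(n_experts, n_units, dim))
    # Persistent store: one content centroid per expert.
    centroids = np.zeros((n_experts, dim))

    for t in range(exposure_steps + retention_steps):
        if t < exposure_steps:
            # Discrete routing (assumed round-robin): one expert per step.
            k = t % n_experts
            x = prototypes[k] + noise * rng.normal(size=dim)
            # Non-gradient update: active units relax toward the input.
            units[k] += pull * (x - units[k])

        if use_memory:
            # Loop 1, recording: track each expert's mean content.
            centroids += ema * (units.mean(axis=1) - centroids)

        # Dissipative turnover: exhausted units are replaced.
        dead = rng.random((n_experts, n_units)) < p_replace
        fresh = rng.normal(size=units.shape)
        if use_memory:
            # Loop 2, seeding: replacements start near the stored centroid
            # of their own expert instead of a fresh random state.
            fresh = centroids[:, None, :] + noise * fresh
        units[dead] = fresh[dead]
        # Loop 3, re-entry: seeded units are part of the population that is
        # recorded on the next step, which closes the cycle.

    means = units.mean(axis=1)
    r = [np.corrcoef(means[k], prototypes[k])[0, 1] for k in range(n_experts)]
    return float(np.mean(r))


if __name__ == "__main__":
    print("with memory   :", round(simulate(use_memory=True), 3))
    print("without memory:", round(simulate(use_memory=False), 3))
```

In this toy setting the input is removed after the exposure phase while turnover continues, so every original unit is eventually replaced. With recording and seeding enabled, the expert means should stay close to their contexts. With both disabled, they should drift back toward chance-level correlation. Because each expert keeps its own centroid and is the only one updated by its context, the stored centroids stay distinct, which is the role the abstract attributes to discrete routing.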
