Phasor Agents: Oscillatory Graphs with Three-Factor Plasticity and Sleep-Staged Learning

Phasor Agents are dynamical systems whose internal state is a Phasor Graph: a weighted graph of coupled Stuart-Landau oscillators. A Stuart-Landau oscillator is a minimal stable "rhythm generator" (the normal form near a Hopf bifurcation); each oscillator is treated as an abstract computational unit, inspired by biological oscillatory populations but not claiming to model them. In this interpretation, oscillator phase tracks relative timing (coherence), while amplitude tracks local gain or activity. Relative phase structure serves as the representational medium. Coupling weights are learned without backpropagation, via three-factor local plasticity: eligibility traces gated by sparse global modulators and oscillation-timed write windows. A central challenge in oscillatory substrates is stability: online weight updates can drive the network into unwanted regimes (e.g., global synchrony), collapsing representational diversity. We therefore separate wake tagging from offline consolidation, inspired by synaptic tagging-and-capture and sleep-stage dynamics: deep-sleep-like gated capture commits tagged changes safely, while REM-like replay reconstructs and perturbs experience for planning. A staged experiment suite validates each mechanism with ablations and falsifiers: eligibility traces preserve credit under delayed modulation; compression-progress signals pass timestamp-shuffle controls; phase-coherent retrieval reaches 4x diffusive baselines under noise; wake/sleep separation expands stable learning by 67 percent under matched weight-norm budgets; REM replay improves maze success rate by 45.5 percentage points; and a Tolman-style latent-learning signature (immediate competence and a detour advantage after unrewarded exploration, consistent with an internal model) emerges from replay (Tolman, 1948). The codebase and all artifacts are open-source.
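
To make the two core ingredients concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard Stuart-Landau normal form dz/dt = (mu + i*omega - |z|^2) z with a diffusive coupling term, a Hebbian-like phase co-activity term Re(z_i conj(z_j)) feeding the eligibility trace, a scalar global modulator, and a boolean write window. The function names `simulate` and `three_factor_step` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(weights, steps=2000, dt=0.01, mu=1.0, seed=0):
    """Euler-integrate coupled Stuart-Landau oscillators (illustrative sketch).

    Assumed dynamics:
        dz_i/dt = (mu + i*omega_i - |z_i|^2) z_i + sum_j W_ij (z_j - z_i)
    """
    rng = np.random.default_rng(seed)
    n = weights.shape[0]
    omega = rng.normal(1.0, 0.1, size=n)              # natural frequencies (assumed)
    z = 0.1 * (rng.normal(size=n) + 1j * rng.normal(size=n))
    for _ in range(steps):
        # Diffusive coupling: sum_j W_ij z_j - (sum_j W_ij) z_i
        coupling = weights @ z - weights.sum(axis=1) * z
        z = z + dt * ((mu + 1j * omega - np.abs(z) ** 2) * z + coupling)
    return z

def three_factor_step(weights, trace, z, modulator, write_open=True,
                      dt=0.01, tau_e=0.5, lr=0.05):
    """One three-factor update: local co-activity -> trace -> gated weight change.

    Factors 1 and 2 (pre/post activity) enter through the phase co-activity
    term; factor 3 is the global modulator. The weight change is applied only
    while the (assumed boolean) write window is open.
    """
    hebb = np.real(np.outer(z, np.conj(z)))           # > 0 in phase, < 0 anti-phase
    trace = trace + dt * (-trace / tau_e + hebb)      # leaky eligibility trace
    gate = modulator if write_open else 0.0
    new_weights = weights + lr * gate * trace
    np.fill_diagonal(new_weights, 0.0)                # no self-coupling
    return new_weights, trace
```

In this sketch the trace keeps accumulating even when the modulator is zero, so a modulator that arrives later can still credit earlier co-activity; that is the property the delayed-modulation experiment is described as testing.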
