Constructing a consistent shared spatial memory is a critical challenge in multi-agent systems, where partial observability and limited bandwidth often lead to catastrophic coordination failures. We introduce a multi-agent predictive coding framework that formulates coordination as the minimization of mutual uncertainty among agents. Through an information bottleneck objective, the framework drives agents to learn not only with whom and what to communicate, but also when. The framework is founded on a grid-cell-like metric that serves as an internal spatial code for self-localization and emerges spontaneously from self-supervised motion prediction. Building on this internal spatial code, agents gradually develop a bandwidth-efficient communication mechanism and specialized neural populations that encode partners' locations, an artificial analogue of hippocampal social place cells (SPCs). These social representations are then used by a hierarchical reinforcement learning policy that actively explores to reduce joint uncertainty. On the Memory-Maze benchmark, our approach is highly resilient to bandwidth constraints: the success rate degrades gracefully from 73.5% to 64.4% as bandwidth shrinks from 128 to 4 bits/step, whereas a full-broadcast baseline collapses from 67.6% to 28.6%. Our findings establish a theoretically principled and biologically plausible basis for how complex social representations emerge from a unified predictive drive, leading to collective intelligence.
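To make the bandwidth comparison concrete, the reported success rates can be turned into absolute and relative drops. The short sketch below uses only the four figures quoted above; the helper name `degradation` is a hypothetical name chosen for illustration and is not part of the described framework.

```python
def degradation(start_pct, end_pct):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = start_pct - end_pct
    relative = 100.0 * absolute / start_pct
    return round(absolute, 1), round(relative, 1)

# Success rates at 128 bits/step and 4 bits/step, as reported in the abstract.
ours = degradation(73.5, 64.4)
broadcast = degradation(67.6, 28.6)

# Bandwidth shrinks by this factor between the two settings.
bandwidth_factor = 128 // 4

print(f"bandwidth reduced {bandwidth_factor}x")
print(f"proposed approach: -{ours[0]} points ({ours[1]}% relative)")
print(f"full broadcast:    -{broadcast[0]} points ({broadcast[1]}% relative)")
```

Under a 32-fold bandwidth reduction, the proposed approach loses 9.1 percentage points (about 12.4% of its starting success rate), while the full-broadcast baseline loses 39.0 points (about 57.7%).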