Social intelligence includes the capability, often referred to as Theory of Mind (ToM), to discern others' behavioral intentions, beliefs, and other mental states. ToM is especially important in multi-agent and human-machine interaction environments, because each agent needs to understand the mental states of other agents in order to respond, interact, and collaborate effectively. Recent research indicates that ToM models can infer beliefs and intentions and anticipate future observations and actions; nonetheless, their deployment on intricate tasks remains notably limited. Challenges arise as the number of agents increases and the environment becomes more complex: interacting with the environment and predicting one another's mental states become difficult and time-consuming. To overcome these limits, we take inspiration from the Theory of Collective Mind (ToCM) mechanism, predicting the observations of all other agents as a unified but plural representation and discerning how our own actions affect this mental state representation. On this foundation, we construct an imaginative space to simulate the multi-agent interaction process, thus improving the efficiency of cooperation among multiple agents in complex decision-making environments. In various cooperative tasks with different numbers of agents, the experimental results show that our approach achieves higher cooperative efficiency and performance than the Multi-Agent Reinforcement Learning (MARL) baselines. We achieve a consistent boost on both SNN- and DNN-based decision networks, and demonstrate that ToCM's inferences about others' mental states can be transferred to new tasks for quick and flexible adaptation.