Large multi-agent systems, such as real-time strategy games, are often driven by the collective behavior of agents. For example, in StarCraft II, human players group spatially close agents into a team and control the team to defeat opponents. Clustering the agents in the game has accordingly been used for various purposes, such as efficient agent control in multi-agent reinforcement learning and game analytics tools for players. However, despite the useful information that clustering provides, learning the dynamics of multi-agent systems at the cluster level has rarely been studied. In this paper, we present a hybrid AI model that couples unsupervised and self-supervised learning to forecast the evolution of clusters in StarCraft II. We develop an unsupervised Hebbian learning method in a set-to-cluster module that efficiently creates a variable number of clusters with lower inference time complexity than K-means clustering. In addition, a prediction module based on long short-term memory (LSTM) is designed to recursively forecast the state vectors generated by the set-to-cluster module to define the cluster configuration. We experimentally demonstrate that the proposed model successfully predicts the complex movement of clusters in the game.
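To make the idea of creating a variable number of clusters concrete, the following is a minimal sketch of a competitive, Hebbian-style (instar) update over 2-D unit positions. It is not the paper's set-to-cluster module: the function name `hebbian_cluster`, the spawning rule, and the hyperparameters `radius`, `lr`, and `epochs` are all assumptions chosen for illustration. A new prototype is created whenever no existing prototype lies within `radius` of a point, so the number of clusters is not fixed in advance.

```python
import math


def hebbian_cluster(points, radius=5.0, lr=0.5, epochs=5):
    """Winner-take-all clustering with an instar (Hebbian-with-decay) update.

    All hyperparameters are illustrative assumptions, not values from the paper.
    Returns (prototypes, labels).
    """
    prototypes = []  # one weight vector (x, y) per cluster

    def nearest(x, y):
        # Index and distance of the prototype closest to (x, y).
        dists = [math.hypot(x - px, y - py) for px, py in prototypes]
        win = min(range(len(dists)), key=dists.__getitem__)
        return win, dists[win]

    for _ in range(epochs):
        for x, y in points:
            if not prototypes:
                prototypes.append((x, y))
                continue
            win, dist = nearest(x, y)
            if dist > radius:
                # No prototype is close enough: spawn a new cluster.
                prototypes.append((x, y))
            else:
                # Instar rule: move only the winning prototype toward the input.
                px, py = prototypes[win]
                prototypes[win] = (px + lr * (x - px), py + lr * (y - py))

    # Final assignment: each point takes the label of its nearest prototype.
    labels = [nearest(x, y)[0] for x, y in points]
    return prototypes, labels


units = [(0, 0), (1, 0), (0, 1), (20, 20), (21, 20), (20, 21)]
prototypes, labels = hebbian_cluster(units)
print(len(prototypes), labels)
```

In this sketch, assigning a point at inference time is a single nearest-prototype lookup, whereas K-means repeats its assignment and update steps until convergence. Whether this matches the complexity argument made in the paper is an assumption.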