In partially observable multi-agent systems, agents typically have access only to local observations, which severely hinders their ability to make precise decisions, particularly during decentralized execution. To alleviate this problem, and inspired by image outpainting, we propose State Inference with Diffusion Models (SIDIFF), which uses diffusion models to reconstruct the original global state from local observations alone. SIDIFF consists of a state generator and a state extractor, which together allow agents to choose suitable actions by considering both the reconstructed global state and their local observations. In addition, SIDIFF can be effortlessly incorporated into existing multi-agent reinforcement learning algorithms to improve their performance. Finally, we evaluate SIDIFF on several experimental platforms, including Multi-Agent Battle City (MABC), a novel and flexible multi-agent reinforcement learning environment we developed. SIDIFF achieves strong results and outperforms other popular algorithms.
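To make the core idea concrete, the following is a minimal toy sketch of conditional diffusion-based state reconstruction: a denoising loop that starts from Gaussian noise and iteratively recovers a global-state vector while conditioning each step on an agent's local observation. All names, dimensions, the linear noise predictor, and the DDPM-style schedule are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Toy sketch (assumed shapes and model; NOT the SIDIFF implementation):
# a conditional diffusion process reconstructs a global state vector
# from an agent's local observation.

rng = np.random.default_rng(0)

STATE_DIM, OBS_DIM, T = 8, 4, 10            # hypothetical sizes / number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)          # linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t):
    """Forward process: noise the clean global state x0 to step t."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

def predict_noise(x_t, obs, t, W):
    """Stand-in 'state generator': a linear noise predictor conditioned on
    the local observation (a real model would be a neural network)."""
    features = np.concatenate([x_t, obs, [t / T]])
    return np.tanh(features @ W)

def p_sample(obs, W):
    """Reverse process: start from pure noise and iteratively denoise,
    conditioning every step on the agent's local observation."""
    x = rng.standard_normal(STATE_DIM)
    for t in reversed(range(T)):
        eps = predict_noise(x, obs, t, W)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:  # no noise injected at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(STATE_DIM)
    return x

W = rng.standard_normal((STATE_DIM + OBS_DIM + 1, STATE_DIM)) * 0.1
local_obs = rng.standard_normal(OBS_DIM)
reconstructed_state = p_sample(local_obs, W)
# The agent would then act on (reconstructed_state, local_obs) jointly.
print(reconstructed_state.shape)
```

In the actual method, the noise predictor would be a trained network and the agent's policy would consume both the reconstructed global state and its raw local observation; this sketch only shows the shape of the conditional reverse-diffusion loop.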