We propose MADP, a novel diffusion-model-based approach to collaboration in decentralized robot swarms. MADP uses diffusion models to sample from complex, high-dimensional action distributions that capture the interdependencies between agents' actions. Each robot conditions policy sampling on a fused representation of its own observations and the perceptual embeddings it receives from peers. To evaluate the approach, we task a team of holonomic robots piloted by MADP with coverage control, a canonical multi-agent navigation problem. The policy is trained by imitation learning from a clairvoyant expert, and the diffusion process is parameterized by a spatial transformer architecture to enable decentralized inference. We evaluate the system under varying numbers, locations, and variances of importance density functions, which capture the robustness demands of real-world coverage tasks. Experiments show that our model inherits valuable properties from diffusion models: it generalizes across agent densities and environments and consistently outperforms state-of-the-art baselines.
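For readers unfamiliar with coverage control, the objective can be illustrated with the standard locational cost, in which each point of the environment is charged its squared distance to the nearest robot, weighted by an importance density function (IDF). The sketch below is a minimal illustration, not the evaluation code of this work: the Gaussian form of the IDF, the unit-square environment, the grid discretization, and the names `make_grid`, `gaussian_idf`, and `coverage_cost` are all assumptions introduced here.

```python
import numpy as np


def make_grid(size, resolution):
    """Return (points, cell_area) for a square environment sampled on a grid."""
    xs = np.linspace(0.0, size, resolution)
    gx, gy = np.meshgrid(xs, xs)
    points = np.stack([gx.ravel(), gy.ravel()], axis=1)  # (M, 2)
    spacing = xs[1] - xs[0]
    return points, spacing ** 2


def gaussian_idf(points, centers, sigmas):
    """Assumed IDF: a sum of isotropic Gaussian bumps.

    The number of bumps, their centers, and their standard deviations play
    the role of the 'numbers, locations, and variances' varied in evaluation.
    """
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (M, K)
    return np.exp(-d2 / (2.0 * sigmas[None, :] ** 2)).sum(axis=1)        # (M,)


def coverage_cost(robots, points, phi, cell_area):
    """Riemann-sum approximation of the locational cost.

    Each grid point contributes its squared distance to the nearest robot,
    weighted by the importance density at that point. Lower is better.
    """
    d2 = ((points[:, None, :] - robots[None, :, :]) ** 2).sum(axis=-1)  # (M, N)
    return float((d2.min(axis=1) * phi).sum() * cell_area)


points, cell_area = make_grid(size=1.0, resolution=50)
centers = np.array([[0.25, 0.25], [0.75, 0.70]])  # hypothetical IDF peaks
sigmas = np.array([0.05, 0.10])                   # hypothetical spreads
phi = gaussian_idf(points, centers, sigmas)

# Two robots placed on the IDF peaks versus two robots bunched in a corner.
on_peaks = np.array([[0.25, 0.25], [0.75, 0.70]])
in_corner = np.array([[0.00, 1.00], [0.05, 0.95]])

cost_on_peaks = coverage_cost(on_peaks, points, phi, cell_area)
cost_in_corner = coverage_cost(in_corner, points, phi, cell_area)
```

Under these assumptions, the configuration with robots on the IDF peaks yields a lower cost than the one with robots bunched in a corner. A clairvoyant expert would have access to the full density `phi` when reducing this cost, whereas a decentralized policy must act on local observations and messages from peers.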