Effective operation and seamless cooperation of robotic systems are fundamental to next-generation technologies and applications. In contexts such as disaster response, swarm operations require coordinated behavior and mobility control to be handled in a distributed manner, and the quality of the agents' actions relies heavily on the communication between them and on the underlying network. In this paper, we formulate the problem of dynamic network bridging as a novel Decentralized Partially Observable Markov Decision Process (Dec-POMDP), in which a swarm of agents cooperates to form a link between two distant moving targets. Furthermore, we propose a Multi-Agent Reinforcement Learning (MARL) approach to the problem based on Graph Convolutional Reinforcement Learning (DGN), which applies naturally to the networked, distributed nature of the task. The proposed method is evaluated in a simulated environment and compared to a centralized heuristic baseline, showing promising results. Moreover, we take a further step toward sim-to-real transfer by additionally evaluating the proposed approach in a near Live Virtual Constructive (LVC) UAV framework.