In modern communication systems, efficient and reliable information dissemination is crucial for supporting critical operations in domains such as disaster response, autonomous vehicles, and sensor networks. This paper introduces a Multi-Agent Reinforcement Learning (MARL) approach as a step toward more decentralized, efficient, and collaborative dissemination. We formulate information dissemination as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) in which each agent independently decides whether to forward a message. This departs from traditional heuristics based on Multi-Point Relay (MPR) selection. Our approach builds on Graph Convolutional Reinforcement Learning, employing Graph Attention Networks (GAT) with dynamic attention to capture essential network features. We propose two methods, L-DGN and HL-DGN, which differ in the information exchanged among agents. We evaluate these decentralized methods against a widely used MPR heuristic and show that the trained policies cover the network efficiently while bypassing the MPR set selection process. This work is a first step toward bolstering the resilience of real-world broadcast communication infrastructures through learned, collaborative information dissemination.
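To make the per-agent forwarding decision concrete, the following is a minimal sketch of decentralized dissemination on a small graph. It is not the L-DGN or HL-DGN method: the hand-written rule `forward_if_useful` merely stands in for a learned policy, and the sketch assumes that a node hearing a transmission also learns the sender's neighbor list (two-hop neighborhood knowledge). All function and variable names are hypothetical.

```python
from collections import deque


def disseminate(adjacency, source, policy):
    """Spread a message from `source` over an undirected graph.

    Each node that receives the message decides locally, via `policy`,
    whether to retransmit it once. Returns (received_nodes, forwarders).
    """
    # Per-agent local belief about which nodes already hold the message
    # (assumption: a hearer learns the sender's neighbor list).
    known = {node: set() for node in adjacency}
    received = {source}
    forwarders = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # The source always transmits; every other node decides on its own,
        # using only its neighbor list and its local belief.
        if node != source and not policy(adjacency[node], known[node]):
            continue
        forwarders.append(node)
        for neighbor in adjacency[node]:
            # The hearer infers that the sender and the sender's
            # neighbors are now covered.
            known[neighbor] |= {node} | set(adjacency[node])
            if neighbor not in received:
                received.add(neighbor)
                queue.append(neighbor)
    return received, forwarders


def always_forward(neighbors, known_covered):
    """Plain flooding baseline: every receiver retransmits."""
    return True


def forward_if_useful(neighbors, known_covered):
    """Stand-in for a learned policy: forward only if some neighbor
    is not yet believed to be covered."""
    return any(n not in known_covered for n in neighbors)


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
    for policy in (always_forward, forward_if_useful):
        received, forwarders = disseminate(graph, 0, policy)
        print(policy.__name__, "coverage:", len(received),
              "transmissions:", len(forwarders))
```

The two quantities printed here, coverage and number of transmissions, are the kind of trade-off the comparison against an MPR heuristic is concerned with: a good policy reaches every node with fewer retransmissions than plain flooding.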