Internet of Things (IoT) technologies have enabled numerous data-driven mobile applications and, through the deployment of networks of IoT sensors, have the potential to significantly improve environmental monitoring and hazard warnings. However, these IoT devices are often power-constrained and rely on wireless communication schemes with limited bandwidth. The power constraints limit the amount of information each device can share across the network, while the bandwidth limitations make it hard for sensors to coordinate their transmissions. In this work, we formulate the communication planning problem for IoT sensors that track the state of the environment. We seek to optimize sensors' decisions in collecting environmental data under stringent resource constraints. We propose a multi-agent reinforcement learning (MARL) method to find the optimal communication policy for each sensor, maximizing tracking accuracy subject to the power and bandwidth limitations. MARL learns and exploits the spatiotemporal correlation of the environmental data at each sensor's location to reduce redundant reports from the sensors. Experiments on wildfire spread with LoRa wireless network simulators show that our MARL method can learn to balance the need to collect enough data to predict wildfire spread against unknown bandwidth limitations.
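The trade-off described in the abstract can be illustrated with a toy one-step team reward. This is only a minimal sketch under stated assumptions, not the formulation used in the paper: the function name `step_reward`, the all-packets-lost collision rule, the absolute-error tracking metric, and the `power_cost` weight are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def step_reward(
    transmit: Sequence[bool],
    true_state: Sequence[float],
    last_reported: Sequence[float],
    capacity: int,
    power_cost: float = 0.1,
) -> Tuple[float, List[float]]:
    """Toy shared reward for one time slot (illustrative assumptions only).

    transmit[i]      -- whether sensor i sends a report in this slot
    true_state[i]    -- current environmental value at sensor i
    last_reported[i] -- value the tracker currently holds for sensor i
    capacity         -- number of reports the channel can carry per slot
    power_cost       -- hypothetical penalty per transmission attempt
    """
    senders = [i for i, sends in enumerate(transmit) if sends]

    # Assumption: if more sensors transmit than the channel can carry,
    # every packet in the slot is lost (a crude collision model).
    delivered = senders if len(senders) <= capacity else []

    # The tracker updates only the entries whose reports got through.
    estimate = list(last_reported)
    for i in delivered:
        estimate[i] = true_state[i]

    # Tracking accuracy is measured here as mean absolute error.
    tracking_error = sum(
        abs(e - s) for e, s in zip(estimate, true_state)
    ) / len(true_state)

    # Every attempt costs power, whether or not it is delivered.
    reward = -tracking_error - power_cost * len(senders)
    return reward, estimate


# One sensor reports a change and the channel has room for it.
ok_reward, ok_estimate = step_reward(
    [True, False, False], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], capacity=1
)

# Two sensors transmit on a channel that carries one report: both are lost.
bad_reward, bad_estimate = step_reward(
    [True, True, False], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], capacity=1
)
```

In this sketch, a redundant or colliding transmission lowers the shared reward without improving the estimate, which is the kind of signal a learned policy could use to suppress unnecessary reports.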