The proliferation of Internet of Things (IoT) networks has created an urgent need for sustainable energy solutions, particularly for battery-constrained, spatially distributed IoT nodes. Low-altitude uncrewed aerial vehicles (UAVs) equipped with wireless power transfer (WPT) capabilities offer a promising solution, but the line-of-sight channels that enable efficient energy delivery also expose sensitive operational data to adversaries. This paper proposes a novel low-altitude UAV-carried, movable-antenna-enhanced transmission system for joint WPT and covert communications, which simultaneously replenishes the energy of IoT nodes and establishes a transmission link with a covert user by leveraging the wireless energy signals as a natural cover. We then formulate a multi-objective optimization problem that jointly maximizes the total harvested energy of the IoT nodes and the sum achievable rate of the covert user while minimizing the propulsion energy consumption of the low-altitude UAV. To address this non-convex and temporally coupled problem, we propose a mixture-of-experts-augmented soft actor-critic (MoE-SAC) algorithm that employs a sparse Top-K gated mixture-of-shallow-experts architecture to represent the multimodal policy distributions arising from the conflicting objectives. We also incorporate an action projection module that explicitly enforces the per-time-slot power budget constraints and the antenna position constraints. Simulation results demonstrate that the proposed approach significantly outperforms several baseline approaches and other state-of-the-art deep reinforcement learning algorithms.
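To make the two mechanisms named in the abstract concrete, the following is a minimal sketch of sparse Top-K gating over expert outputs and of an action projection step. It is an illustration under stated assumptions, not the paper's implementation: the function names (`top_k_gate`, `mix_experts`, `project_action`), the softmax over the selected logits, the rescaling of powers to meet the budget, and the box bounds on antenna positions are all hypothetical choices made here. In particular, the abstract does not say how the projection is computed or whether the antenna position constraints include more than box bounds.

```python
import math


def top_k_gate(logits, k):
    """Sparse Top-K gating (assumed form): softmax over the k largest
    gate logits, with every other expert receiving weight zero."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits[i] for i in top)
    exp = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exp.values())
    return [exp[i] / z if i in exp else 0.0 for i in range(len(logits))]


def mix_experts(weights, expert_outputs):
    """Weighted sum of expert output vectors; zero-weight experts are skipped."""
    dim = len(expert_outputs[0])
    out = [0.0] * dim
    for w, y in zip(weights, expert_outputs):
        if w == 0.0:
            continue  # sparse routing: this expert is not used
        for j in range(dim):
            out[j] += w * y[j]
    return out


def project_action(powers, positions, p_max, x_min, x_max):
    """Map a raw action onto a feasible one (assumed constraint set):
    non-negative powers whose sum is at most p_max, and antenna
    positions clipped to the interval [x_min, x_max]."""
    powers = [max(0.0, p) for p in powers]
    total = sum(powers)
    if total > p_max:
        # Rescale so the per-slot budget holds with equality.
        scale = p_max / total
        powers = [p * scale for p in powers]
    positions = [min(max(x, x_min), x_max) for x in positions]
    return powers, positions
```

For example, `project_action([3.0, -1.0, 1.0], [-0.2, 0.5, 1.4], 2.0, 0.0, 1.0)` clips the negative power to zero, halves the remaining powers to meet the budget of 2.0, and clips the positions into the unit interval. Rescaling is a simple feasibility-restoring map rather than a Euclidean projection; it is used here only because it keeps the sketch short.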