5G and beyond networks need dynamic and efficient infrastructure management to better adapt to time-varying user behavior and network conditions (e.g., user mobility, interference, user traffic, and the evolution of the network topology). In this paper, we propose to manage the trajectories of Mobile Access Points (MAPs) under all these dynamic constraints with reduced complexity. We first formulate the placement problem of managing MAPs over time. Our solution addresses time-varying user traffic and user mobility through Multi-Agent Deep Reinforcement Learning (MADRL). To achieve real-time behavior, the proposed solution learns to perform a distributed assignment of MAP-user positions and schedules the MAP path among all users without centralized user-clustering feedback. Our solution exploits a dual-attention MADRL model via proximal policy optimization to dynamically move MAPs in 3D. The dual attention takes into account information from both users and MAPs. The cooperation mechanism of our solution makes it possible to handle different scenarios without a priori information and without retraining, which significantly reduces complexity.