5G and beyond networks need dynamic and efficient infrastructure management to better adapt to time-varying user behaviors (e.g., user mobility, interference, user traffic, and evolution of the network topology). In this paper, we propose to manage the trajectories of Mobile Access Points (MAPs) under all these dynamic constraints with reduced complexity. We first formulate the placement problem of managing MAPs over time. Our solution addresses time-varying user traffic and user mobility through a Multi-Agent Deep Reinforcement Learning (MADRL) approach. To achieve real-time behavior, the proposed solution learns to assign MAP-user positions in a distributed manner and to schedule the MAP path among all users without centralized user clustering feedback. Our solution exploits a dual-attention MADRL model trained via proximal policy optimization to dynamically move MAPs in 3D. The dual-attention mechanism takes into account information from both users and MAPs. The cooperation mechanism of our solution makes it possible to handle different scenarios without a priori information and without re-training, which significantly reduces complexity.