Multi-Agent Reinforcement Learning with Common Policy for Antenna Tilt Optimization

From arXiv. 7 pages and 13 figures. Submitted to IAENG International Journal of Computer Science for publication consideration; the paper has been accepted with minor changes. This is the latest submitted version.

This paper presents a method for optimizing wireless networks by adjusting cell parameters that affect the performance of both the cell being optimized and its surrounding cells. The method uses multiple reinforcement learning agents that share a common policy and use information from neighboring cells to determine the state and reward. To avoid impairing network performance during the initial stages of learning, agents are pre-trained in a preceding offline learning phase. During this phase, an initial policy is obtained using feedback from a static network simulator across a wide variety of scenarios. The agents can then tune the cell parameters of a test network by suggesting small incremental changes, slowly guiding the network toward an optimal configuration. The agents propose changes using the experience gained with the simulator in the pre-training phase, but they can also continue to learn from current network readings after each change. The results show that, when applied to remote antenna tilt optimization, the proposed approach significantly improves on the performance gains already provided by expert system-based methods. Significant gains are also observed when the approach is compared with a similar method in which the state and reward do not incorporate information from neighboring cells.
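The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes tabular Q-learning, a toy 1-D line of cells, and an invented KPI in which a cell that is tilted below its ideal value interferes with its neighbors. All names (`kpi`, `local_score`, `observe`, `step_all`, `pretrain`), the tilt range, the step size, and the reward weighting are hypothetical. In particular, the toy state is derived from a hidden ideal tilt, whereas a real agent would derive it from measured network readings.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)        # small incremental tilt steps (assumed step size)
TILT_MIN, TILT_MAX = 0, 10  # assumed tilt range

def sign(x):
    return (x > 0) - (x < 0)

def neighbors(i, n):
    """Adjacent cells on a toy 1-D line of n cells (n >= 2)."""
    return [j for j in (i - 1, i + 1) if 0 <= j < n]

def kpi(tilts, targets, i):
    """Toy KPI: own tilt error plus interference from neighbors
    whose tilt is below their ideal value."""
    interference = sum(max(0, targets[j] - tilts[j])
                       for j in neighbors(i, len(tilts)))
    return -abs(tilts[i] - targets[i]) - 0.5 * interference

def local_score(tilts, targets, i):
    """Reward basis: own KPI plus half the mean KPI of the neighbors."""
    nbrs = neighbors(i, len(tilts))
    nbr_mean = sum(kpi(tilts, targets, j) for j in nbrs) / len(nbrs)
    return kpi(tilts, targets, i) + 0.5 * nbr_mean

def observe(tilts, targets, i):
    """State: coarse error direction of the cell and of its neighborhood."""
    nbrs = neighbors(i, len(tilts))
    return (sign(tilts[i] - targets[i]),
            sign(sum(tilts[j] - targets[j] for j in nbrs)))

def step_all(q, tilts, targets, rng, epsilon, alpha=0.1, gamma=0.2):
    """Each cell acts once, in turn; all cells read and update one Q-table."""
    for i in range(len(tilts)):
        s = observe(tilts, targets, i)
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)                        # explore
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])  # exploit
        before = local_score(tilts, targets, i)
        tilts[i] = min(TILT_MAX, max(TILT_MIN, tilts[i] + a))
        reward = local_score(tilts, targets, i) - before
        s2 = observe(tilts, targets, i)
        best_next = max(q[(s2, act)] for act in ACTIONS)
        q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])

def pretrain(episodes=300, n_cells=5, rounds=15, seed=0):
    """Offline phase: learn the shared policy on random simulated scenarios."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        targets = [rng.randint(2, 8) for _ in range(n_cells)]
        tilts = [rng.randint(TILT_MIN, TILT_MAX) for _ in range(n_cells)]
        for _ in range(rounds):
            step_all(q, tilts, targets, rng, epsilon=0.2)
    return q

def total_kpi(tilts, targets):
    return sum(kpi(tilts, targets, i) for i in range(len(tilts)))

# Deployment on a "test network": greedy actions, learning still enabled.
q_table = pretrain()
test_targets = [3, 5, 4, 6, 5]
test_tilts = [0, 10, 0, 10, 0]
kpi_before = total_kpi(test_tilts, test_targets)
deploy_rng = random.Random(1)
for _ in range(12):
    step_all(q_table, test_tilts, test_targets, deploy_rng, epsilon=0.0)
kpi_after = total_kpi(test_tilts, test_targets)
print("total KPI before:", kpi_before, "after:", kpi_after)
```

Because every cell indexes the same `q_table`, experience gathered by one cell is immediately available to all others, which is one plausible reading of "common policy". Setting `epsilon=0.0` at deployment keeps the agents from exploring on the live network while the Q-update still runs after each change.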

