A Comparative Analysis of Deep Reinforcement Learning-based xApps in O-RAN

The highly heterogeneous ecosystem of Next Generation (NextG) wireless communication systems calls for novel networking paradigms in which functionalities and operations can be dynamically and optimally reconfigured in real time to adapt to changing traffic conditions and satisfy stringent and diverse Quality of Service (QoS) demands. Open Radio Access Network (RAN) technologies, and specifically those being standardized by the O-RAN Alliance, make it possible to integrate network intelligence into the once monolithic RAN via intelligent applications, namely xApps and rApps. These applications enable flexible control of network resources and functionalities, network management, and orchestration through data-driven control loops. Although recent work has demonstrated the effectiveness of Deep Reinforcement Learning (DRL) in controlling O-RAN systems, how to design these solutions so that they do not create conflicts or unfair resource allocation policies remains an open challenge. In this paper, we perform a comparative analysis that dissects the impact of different DRL-based xApp designs on network performance. Specifically, we benchmark 12 different xApps that embed DRL agents trained using different reward functions, with different action spaces, and with the ability to hierarchically control different network parameters. We prototype and evaluate these xApps on Colosseum, the world's largest O-RAN-compliant wireless network emulator with hardware-in-the-loop. We share the lessons learned and discuss our experimental results, which demonstrate how certain design choices deliver the highest performance while others can result in competitive behavior between different classes of traffic with similar objectives.

Networking

Networking: IFIP International Conferences on Networking. Explanation: international conference on networking. Publisher: IFIP. SIT: http://dblp.uni-trier.de/db/conf/networking/index.html
