The Open Radio Access Network (O-RAN) architecture enables intelligent, automated optimization of the RAN through applications deployed on the RAN Intelligent Controller (RIC) platform, providing capabilities beyond those of traditional RAN solutions. Within this paradigm, Traffic Steering (TS) is a pivotal RIC application that optimizes cell-level mobility settings in near-real-time, aiming to significantly improve network spectral efficiency. In this paper, we design a novel TS algorithm based on a Cascade Reinforcement Learning (CaRL) framework. We propose state space factorization and policy decomposition to reduce the need for large models and well-labeled datasets. For each sub-state space, an RL sub-policy is trained to learn an optimized mapping onto the action space. To apply CaRL to new network regions, we propose a knowledge transfer approach that initializes a new sub-policy from the knowledge learned by already-trained policies. To evaluate CaRL, we build a data-driven, scalable RIC digital twin (DT) modeled on real-world data from a tier-1 mobile operator in the US, including network configuration, user geo-distribution, and traffic demand, among others. We evaluate CaRL on two DT scenarios representing network clusters in two different cities and compare its performance with the business-as-usual (BAU) policy and with competing optimization approaches based on heuristic and Q-table algorithms. Benchmarking results show that CaRL performs best, improving the average cluster-aggregated downlink throughput over the BAU policy by 24% and 18% in the two scenarios, respectively.
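To make the ideas of state space factorization, policy decomposition, and knowledge transfer more concrete, the following is a minimal toy sketch. It is not the CaRL algorithm itself, whose details the abstract does not specify. The sub-space rule `sub_space_of`, the action set `ACTIONS`, the tabular value learner `SubPolicy`, and the averaging rule in `transfer_init` are all hypothetical stand-ins chosen for illustration; in particular, a tabular learner is used only for brevity.

```python
import random

# Hypothetical action set: adjustments to a cell-level mobility offset (in dB).
ACTIONS = [-2, 0, 2]


def sub_space_of(state):
    """Assumed factorization rule: route a (load_level, hour) state to a sub-space."""
    load_level, _hour = state
    return "low_load" if load_level < 5 else "high_load"


class SubPolicy:
    """Toy tabular, epsilon-greedy value learner for one sub-state space."""

    def __init__(self, init_q=None, lr=0.5, epsilon=0.1):
        self.q = dict(init_q) if init_q else {}
        self.lr = lr
        self.epsilon = epsilon

    def act(self, state, rng):
        # Explore with probability epsilon, otherwise pick the best-valued action.
        if rng.random() < self.epsilon:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward):
        # Move the value estimate a fraction lr toward the observed reward.
        key = (state, action)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.lr * (reward - old)


class FactorizedPolicy:
    """Policy decomposition: one sub-policy per sub-state space, created lazily."""

    def __init__(self):
        self.sub_policies = {}

    def policy_for(self, state):
        key = sub_space_of(state)
        if key not in self.sub_policies:
            self.sub_policies[key] = SubPolicy()
        return self.sub_policies[key]


def transfer_init(trained_policies):
    """Assumed transfer rule: start a new sub-policy from the mean of trained values."""
    sums, counts = {}, {}
    for policy in trained_policies:
        for key, value in policy.q.items():
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1
    return SubPolicy(init_q={k: sums[k] / counts[k] for k in sums})
```

In this sketch, each sub-policy only ever sees states from its own sub-space, which keeps each learner small. A sub-policy for a new region starts from averaged values rather than from scratch, which is one simple way to reuse what trained policies have learned.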