In this paper, we propose an rApp, named SliceMapper, to optimize the mapping of the open centralized unit (O-CU) and open distributed unit (O-DU) of an open radio access network (O-RAN) slice subnet onto the underlying open cloud (O-Cloud) sites in sixth-generation (6G) O-RAN. To accomplish this, we first design a system model for SliceMapper and introduce its mathematical framework. Next, we formulate the mapping process addressed by SliceMapper as a sequential decision-making optimization problem. To solve this problem, we implement both on-policy and off-policy variants of the Q-learning algorithm, employing tabular representation as well as function approximation for each variant. To evaluate the effectiveness of these approaches, we conduct a series of simulations under various scenarios and then perform a comparative analysis of all four variants. The results demonstrate that the on-policy function approximation method outperforms the alternative approaches in terms of stability, exhibiting a lower standard deviation across all random seeds. However, the on-policy and off-policy tabular representation methods achieve higher average rewards, with values of 5.42 and 5.12, respectively. Finally, we conclude the paper and outline several directions for future research.
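To make the distinction between the on-policy and off-policy tabular variants concrete, the sketch below contrasts the two update rules on a toy placement chain. This is an illustrative assumption, not SliceMapper's implementation: the states, actions, rewards, and hyperparameters (`ALPHA`, `GAMMA`, `EPS`) are all hypothetical, and the environment stands in for a simplified "choose the next O-Cloud site" sequence. The only faithful part is the targets: SARSA (on-policy) bootstraps on the action actually taken, while Q-learning (off-policy) bootstraps on the greedy action.

```python
import random

# Toy sketch (NOT the paper's model): a 1-D chain of candidate "sites";
# reward 1 is granted only when the final site is reached.
N_STATES = 5
ACTIONS = [0, 1]                 # 0 = stay, 1 = move to the next site
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1  # assumed hyperparameters

def step(s, a):
    """Illustrative dynamics: episode ends at the last state."""
    s2 = min(s + a, N_STATES - 1)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

def eps_greedy(Q, s):
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def train(on_policy, episodes=500, seed=0):
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, a, done = 0, eps_greedy(Q, 0), False
        while not done:
            s2, r, done = step(s, a)
            a2 = eps_greedy(Q, s2)
            if on_policy:
                # SARSA target: value of the action the policy will take
                target = r + (0.0 if done else GAMMA * Q[(s2, a2)])
            else:
                # Q-learning target: value of the greedy action
                target = r + (0.0 if done else
                              GAMMA * max(Q[(s2, b)] for b in ACTIONS))
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
            s, a = s2, a2
    return Q

Q_sarsa = train(on_policy=True)
Q_qlearn = train(on_policy=False)
# Both variants should learn that advancing beats staying at the start state.
print(Q_sarsa[(0, 1)] > Q_sarsa[(0, 0)],
      Q_qlearn[(0, 1)] > Q_qlearn[(0, 0)])
```

The function-approximation variants discussed in the abstract would replace the dictionary `Q` with a parameterized estimator updated by gradient steps toward the same targets.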