Optimal resource allocation in modern communication networks calls for the optimization of objective functions that are only accessible via costly separate evaluations for each candidate solution. The conventional approach carries out the optimization of resource-allocation parameters for each system configuration, characterized, e.g., by topology and traffic statistics, using global search methods such as Bayesian optimization (BO). These methods tend to require a large number of iterations, and hence a large number of key performance indicator (KPI) evaluations. In this paper, we propose the use of meta-learning to transfer knowledge from data collected on related, but distinct, configurations in order to speed up optimization on new network configurations. Specifically, we combine meta-learning with BO, as well as with multi-armed bandit (MAB) optimization, with the latter having the potential advantage of operating directly on a discrete search space. Furthermore, we introduce novel contextual meta-BO and meta-MAB algorithms, in which transfer of knowledge across configurations occurs at the level of a mapping from graph-based contextual information to resource-allocation parameters. Experiments for the problem of open-loop power control (OLPC) parameter optimization for the uplink of multi-cell, multi-antenna systems provide insights into the potential benefits of meta-learning and contextual optimization.