Virtual Network Embedding (VNE) is a key enabler of network slicing, yet most formulations assume that each Virtual Network Request (VNR) has a fixed topology. Recently, VNE with Alternative topologies (VNEAP) was introduced to capture malleable VNRs, where each request can be instantiated using one of several functionally equivalent topologies that trade resources differently. While this flexibility enlarges the feasible space, it also introduces an additional decision layer, making dynamic embedding more challenging. This paper proposes HRL-VNEAP, a hierarchical reinforcement learning approach for VNEAP under dynamic arrivals. A high-level policy selects the most suitable alternative topology (or rejects the request), and a low-level policy embeds the chosen topology onto the substrate network. Experiments on realistic substrate topologies under multiple traffic loads show that naive exploitation strategies provide only modest gains, whereas HRL-VNEAP consistently achieves the best performance across all metrics. Compared to the strongest tested baselines, HRL-VNEAP improves acceptance ratio by up to \textbf{20.7\%}, total revenue by up to \textbf{36.2\%}, and revenue-over-cost by up to \textbf{22.1\%}. Finally, we benchmark against an MILP formulation on tractable instances to quantify the remaining gap to optimality and motivate future work on learning- and optimization-based VNEAP solutions.
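The two-level decision structure described above can be illustrated with a minimal sketch. This is not the HRL-VNEAP implementation: both policies below are hand-written placeholder heuristics standing in for learned policies, the substrate is reduced to per-node CPU capacities (link mapping is ignored), and all names (`Topology`, `Request`, `high_level_policy`, `low_level_policy`, `handle_request`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Topology:
    # Virtual node name -> CPU units demanded (links omitted for brevity).
    cpu_demand: Dict[str, int]


@dataclass
class Request:
    # Functionally equivalent alternative topologies for one VNR.
    alternatives: List[Topology]


def high_level_policy(request: Request, free_cpu: Dict[str, int]) -> Optional[int]:
    """Placeholder for the high-level policy (assumption, not the paper's).

    Picks the alternative with the smallest total CPU demand that does not
    exceed the total free CPU; returns None to reject the request.
    """
    total_free = sum(free_cpu.values())
    feasible = [
        (sum(t.cpu_demand.values()), i)
        for i, t in enumerate(request.alternatives)
        if sum(t.cpu_demand.values()) <= total_free
    ]
    return min(feasible)[1] if feasible else None


def low_level_policy(topology: Topology, free_cpu: Dict[str, int]) -> Optional[Dict[str, str]]:
    """Placeholder for the low-level policy (assumption, not the paper's).

    Greedily maps each virtual node, largest demand first, to a distinct
    substrate node with the most free CPU; returns None if mapping fails.
    """
    remaining = dict(free_cpu)
    mapping: Dict[str, str] = {}
    for v, demand in sorted(topology.cpu_demand.items(), key=lambda kv: -kv[1]):
        candidates = [
            s for s in remaining
            if s not in mapping.values() and remaining[s] >= demand
        ]
        if not candidates:
            return None
        best = max(candidates, key=lambda s: remaining[s])
        mapping[v] = best
        remaining[best] -= demand
    return mapping


def handle_request(
    request: Request, free_cpu: Dict[str, int]
) -> Optional[Tuple[int, Dict[str, str]]]:
    """Run both levels; commit resources only if the embedding succeeds."""
    choice = high_level_policy(request, free_cpu)
    if choice is None:
        return None  # rejected at the high level
    topology = request.alternatives[choice]
    mapping = low_level_policy(topology, free_cpu)
    if mapping is None:
        return None  # chosen topology could not be embedded
    for v, s in mapping.items():
        free_cpu[s] -= topology.cpu_demand[v]
    return choice, mapping
```

In a learned setting, the two placeholder functions would be replaced by trained policies, and the outcome of `handle_request` (accepted or rejected, plus revenue and cost) would feed the reward signal for both levels.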