Compared with IP multicast, Overlay Multicast (OM) offers better compatibility and more flexible deployment in heterogeneous, cross-domain networks. However, traditional OM is unaware of the state of the underlying physical resources and therefore adapts poorly to dynamic traffic, while existing reinforcement learning methods fail to decouple OM's tightly coupled optimization objectives, leading to high complexity, slow convergence, and instability. To address these problems, we propose MA-DHRL-OM, a multi-agent deep hierarchical reinforcement learning approach. Using the global view provided by software-defined networking (SDN), it builds a traffic-aware model for OM path planning. The method uses hierarchical agents to decompose OM tree construction into two stages, which reduces the action space and improves convergence stability. Multi-agent collaboration balances the multiple optimization objectives while enhancing scalability and adaptability. Experiments show that MA-DHRL-OM outperforms existing methods in delay, bandwidth utilization, and packet loss, with more stable convergence and more flexible routing.
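To make the idea of a two-stage decomposition concrete, the following is a minimal sketch and not the MA-DHRL-OM algorithm itself. The abstract does not say what the two stages are, so the split used here is an assumption made for illustration: an upper-level decision picks which destination to attach next, and a lower-level decision picks the path that attaches it to the partial tree. Greedy shortest-path heuristics stand in for the learned policies, and the names `shortest_path`, `build_tree`, and the toy `graph` with its link costs are hypothetical.

```python
import heapq


def shortest_path(graph, sources, target):
    """Dijkstra from any node in `sources` to `target`; returns (cost, path)."""
    heap = [(0.0, s, [s]) for s in sorted(sources)]
    heapq.heapify(heap)
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in seen:
                heapq.heappush(heap, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []


def build_tree(graph, source, destinations):
    """Grow a multicast tree one destination at a time; returns a set of edges."""
    tree_nodes = {source}
    tree_edges = set()
    remaining = set(destinations) - tree_nodes
    while remaining:
        # Stage 1 (assumed upper-level decision): choose the next destination.
        # A greedy rule (cheapest to reach from the current tree) stands in
        # for a learned high-level policy.
        candidates = {d: shortest_path(graph, tree_nodes, d) for d in remaining}
        dest = min(sorted(candidates), key=lambda d: candidates[d][0])
        cost, path = candidates[dest]
        if not path:
            raise ValueError(f"destination {dest!r} is unreachable")
        # Stage 2 (assumed lower-level decision): choose the attaching path.
        # Here it is simply the shortest path found above.
        tree_edges.update(zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_edges


# Hypothetical topology; weights are illustrative link costs (for example,
# delays that an SDN controller might derive from measured link state).
graph = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 1, "d1": 5},
    "b": {"s": 4, "a": 1, "d1": 1, "d2": 2},
    "d1": {"a": 5, "b": 1},
    "d2": {"b": 2},
}
tree = build_tree(graph, "s", ["d1", "d2"])
```

Under this assumed split, each decision chooses from a small set (one destination, then one path) rather than from all candidate trees at once, which is the kind of action-space reduction the abstract attributes to the hierarchical design.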