Reinforcement Learning (RL) has gained significant momentum in the development of network protocols. However, RL-based protocols are still in their infancy, and substantial research is required to build deployable solutions. Developing an RL-based protocol is a complex and challenging process that involves several model design decisions and requires significant training and evaluation in real and simulated network topologies. Network simulators offer an efficient training environment for RL-based protocols because they are deterministic and can run in parallel. In this paper, we introduce \textit{RayNet}, a scalable and adaptable simulation platform for the development of RL-based network protocols. RayNet integrates OMNeT++, a fully programmable network simulator, with Ray/RLlib, a scalable training platform for distributed RL. RayNet facilitates the methodical development of RL-based network protocols, so that researchers can focus on the problem at hand rather than on the implementation details of the learning component of their research. As a proof of concept, we developed a simple RL-based congestion control approach, showcasing that RayNet can be a valuable platform for RL-based research in computer networks, enabling scalable training and evaluation. We compared RayNet with \textit{ns3-gym}, a platform with similar objectives, and showed that agents collect experience faster in RL environments with RayNet.