Various congestion control protocols have been designed to achieve high performance in different network environments. Modern online learning solutions that delegate congestion control actions to a machine cannot properly converge within the stringent time scales of data centers. We leverage multiagent reinforcement learning to design a system for dynamic tuning of congestion control parameters at end-hosts in a data center. The system includes agents at the end-hosts that monitor and report the network and traffic states, and agents that run the reinforcement learning algorithm on the reported states. Based on the state of the environment, the system generates congestion control parameters that optimize network performance metrics such as throughput and latency. As a case study, we examine BBR, a prominent, recently developed congestion control protocol. Our experiments demonstrate that the proposed system has the potential to mitigate the problems caused by static parameters.