With the rapid growth of machine learning (ML) workloads in datacenters, existing congestion control (CC) algorithms fail to deliver the required performance at scale. ML traffic is bursty and bulk-synchronous and thus requires quick reaction and strong fairness. We show that existing CC algorithms that use delay as a main signal react too slowly and are not always fair. We design SMaRTT, a simple sender-based CC algorithm that combines delay, ECN, and optional packet trimming for fast and precise window adjustments. At the core of SMaRTT lies the novel QuickAdapt algorithm that accurately estimates the bandwidth at the receiver. We show how to combine SMaRTT with a new per-packet traffic load-balancing algorithm called REPS to effectively reroute packets around congested hotspots as well as flaky or failing links. Our evaluation shows that SMaRTT alone outperforms EQDS, Swift, BBR, and MPRDMA by up to 50% on modern datacenter networks.
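To make the idea of combining delay and ECN into a single window adjustment more concrete, the following is a minimal illustrative sketch. It is not SMaRTT's actual rule set, and it does not implement QuickAdapt or REPS, none of which are specified in the abstract. The `Ack` type, the `adjust_window` function, the four-way case split, and every constant (`target_rtt_us`, `mtu`, the 0.5 decrease factor, the window bounds) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ack:
    """Hypothetical per-ACK feedback seen by the sender (names are assumptions)."""
    rtt_us: float          # measured round-trip time for the acknowledged packet
    ecn_marked: bool       # True if the packet carried an ECN congestion mark
    trimmed: bool = False  # True if a switch trimmed the payload (optional signal)


def adjust_window(cwnd, ack, target_rtt_us=12.0, mtu=4096,
                  min_cwnd=4096, max_cwnd=1_000_000):
    """Return a new congestion window in bytes.

    Illustrative policy only (assumed, not taken from the paper):
      - trimmed packet: back off by one MTU right away
      - ECN mark and high delay: decrease in proportion to the delay overshoot
      - only one of the two signals: hold the window
      - neither signal: additive increase of about one MTU per RTT
    """
    high_delay = ack.rtt_us > target_rtt_us
    if ack.trimmed:
        new_cwnd = cwnd - mtu
    elif ack.ecn_marked and high_delay:
        overshoot = (ack.rtt_us - target_rtt_us) / ack.rtt_us
        new_cwnd = cwnd * (1 - 0.5 * overshoot)
    elif ack.ecn_marked or high_delay:
        new_cwnd = cwnd
    else:
        new_cwnd = cwnd + mtu * mtu / cwnd
    # Clamp to the configured window bounds.
    return max(min_cwnd, min(max_cwnd, new_cwnd))


if __name__ == "__main__":
    cwnd = 40960
    for ack in (Ack(10.0, False), Ack(24.0, True), Ack(10.0, True)):
        cwnd = adjust_window(cwnd, ack)
        print(ack, round(cwnd, 1))
```

The point of the case split is that two independent signals let the sender distinguish clear congestion (both fire) from ambiguous situations (only one fires), which a delay-only scheme cannot do.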