RDMA has been widely deployed in datacenters in recent years to meet the low-latency requirements of today's applications. However, RDMA's in-order delivery requirement prevents it from leveraging powerful techniques that improve datacenter performance, ranging from fine-grained load balancers to throughput-optimal expander topologies. We demonstrate experimentally that these techniques significantly degrade performance in an RDMA network because they induce packet reordering. Lifting the in-order delivery constraint makes RDMA networks more flexible and enables them to employ these performance-enhancing techniques. To this end, we propose Eunomia, an ordering layer that equips RDMA NICs to handle packet reordering. Eunomia employs a hybrid-dynamic bitmap structure that, with the help of a customized memory controller, uses the limited on-chip memory efficiently and handles high degrees of packet reordering. We evaluate the feasibility of Eunomia through an FPGA-based implementation and its performance through large-scale simulations. We show that Eunomia enables a wide range of applications in RDMA datacenter networks. For example, fine-grained load balancers reduce average flow completion times by 85% and 52% compared to ECMP and Conweave, respectively, and deploying RDMA on expander topologies such as Jellyfish yields up to 60% lower flow completion times and higher throughput than Fat tree.
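To make the idea of bitmap-based reorder tracking concrete, here is a minimal sketch of a fixed-window receive bitmap. It is not Eunomia's hybrid-dynamic structure, whose details the abstract does not give. The class name `ReorderBitmap`, the `window` parameter, and the returned status strings are all hypothetical, and sequence-number wraparound is ignored for simplicity.

```python
class ReorderBitmap:
    """Fixed-window receive bitmap (illustrative only, not Eunomia's design)."""

    def __init__(self, window: int = 64) -> None:
        self.window = window  # how far ahead of `base` a packet may arrive
        self.base = 0         # lowest sequence number not yet received
        self.bits = 0         # bit i set => packet (base + i) has arrived

    def receive(self, seq: int) -> str:
        offset = seq - self.base
        if offset < 0:
            return "duplicate"       # already delivered in order
        if offset >= self.window:
            return "out_of_window"   # reordered further than we can track
        mask = 1 << offset
        if self.bits & mask:
            return "duplicate"       # already buffered out of order
        self.bits |= mask
        # Slide the window past the contiguous prefix of received packets.
        while self.bits & 1:
            self.bits >>= 1
            self.base += 1
        return "accepted"


tracker = ReorderBitmap(window=4)
for seq in (1, 0, 3, 2):  # packets arrive out of order
    tracker.receive(seq)
print(tracker.base)       # all four packets received, so base has advanced to 4
```

A fixed window like this trades memory for tolerance: a larger `window` absorbs more reordering but costs more bits per connection, which is the tension a limited on-chip memory budget creates.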