The explosive growth of communication traffic in datacenters imposes increasingly stringent performance requirements on the underlying networks. In recent years, researchers have developed innovative optical switching technologies that enable reconfigurable datacenter networks (RDCNs), which support very fast topology reconfigurations. This paper presents D3, a novel and feasible RDCN architecture that improves throughput and flow completion time. D3 quickly and jointly adapts its links and packet scheduling to the evolving demand, combining demand-oblivious and demand-aware behaviors when needed. D3 relies on a decentralized network control plane supporting greedy, integrated-multihop, IP-based routing, which allows it to react quickly and locally to topological changes without overhead. A rack-local synchronization and transport layer further supports fast network adjustments. Moreover, we argue that D3 can be implemented using the recently proposed Sirius architecture (SIGCOMM 2020). We report on an extensive empirical evaluation using packet-level simulations. We find that D3 improves throughput by up to 15% and preserves competitive flow completion times compared to the state of the art. We further provide an analytical explanation of the superiority of D3, introducing an extension of the well-known Birkhoff-von Neumann decomposition, which may be of independent interest.
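For readers unfamiliar with the Birkhoff-von Neumann decomposition mentioned above, the following is a minimal sketch of the classical version only, not of the extension introduced in the paper and not of D3's scheduler. It writes a matrix with equal row and column sums as a weighted sum of permutation matrices; in a reconfigurable network one can loosely read each permutation as one matching of racks and each weight as the share of time that matching is held. The function names `perfect_matching` and `bvn_decompose` and the example demand matrix are illustrative assumptions.

```python
from fractions import Fraction


def perfect_matching(support):
    """Kuhn's augmenting-path algorithm on a bipartite support graph.

    support[i][j] is True if row i may be matched to column j.
    Returns perm with perm[i] = column matched to row i, or None.
    """
    n = len(support)
    col_owner = [-1] * n  # col_owner[j] = row currently matched to column j

    def try_row(i, seen):
        for j in range(n):
            if support[i][j] and not seen[j]:
                seen[j] = True
                if col_owner[j] == -1 or try_row(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    perm = [0] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm


def bvn_decompose(matrix):
    """Classical Birkhoff-von Neumann decomposition.

    matrix: square, non-negative, all row and column sums equal
    (use ints or Fractions so the arithmetic stays exact).
    Returns a list of (weight, perm) pairs whose weighted permutation
    matrices sum to the input.
    """
    n = len(matrix)
    residual = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in residual for x in row):
        perm = perfect_matching([[x > 0 for x in row] for row in residual])
        if perm is None:
            raise ValueError("row and column sums are not all equal")
        # The smallest entry on the matching is zeroed out in this step,
        # so the loop ends after at most n*n iterations.
        weight = min(residual[i][perm[i]] for i in range(n))
        for i in range(n):
            residual[i][perm[i]] -= weight
        terms.append((weight, perm))
    return terms


if __name__ == "__main__":
    # Hypothetical 3-rack demand matrix; every row and column sums to 4.
    demand = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
    for weight, perm in bvn_decompose(demand):
        print(weight, perm)
```

Exact `Fraction` arithmetic is used so that the residual reaches zero exactly; with floating-point inputs a tolerance would be needed instead of the `> 0` test.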