High-radix, low-diameter networks such as HyperX and Dragonfly use a Full-mesh core and rely on multiple virtual channels (VCs) to avoid packet deadlocks under adaptive routing. However, VCs add significant switch overhead in area, power, and design complexity, which limits switch scalability. This paper first revisits VC-less routing based on link ordering schemes in Full-mesh networks; these schemes are simple to implement but suffer performance degradation under adversarial traffic. To overcome this limitation, we propose TERA (Topology-Embedded Routing Algorithm), a novel routing algorithm that uses an embedded physical subnetwork to provide deadlock-free non-minimal paths without VCs. In a Full-mesh network, TERA outperforms link ordering routing algorithms by 80% under adversarial traffic and by up to 100% in application kernels. Furthermore, compared to VC-based approaches, it reduces buffer requirements by 50% while maintaining comparable latency and throughput. Lastly, early results from a 2D-HyperX evaluation show that TERA outperforms state-of-the-art algorithms that use the same number of VCs, with performance improvements of up to 32%.