We consider the problem of distilling efficient network topologies for collective communications. We provide an algorithmic framework for constructing direct-connect topologies optimized for the trade-off between node latency and bandwidth under a given collective communication workload. The framework starts from small base topologies and their associated communication schedules, then iteratively applies a set of expansion techniques to derive much larger topologies. The schedules for these derived topologies are either synthesized along with the expansions or computed using an optimization formulation. This approach lets us synthesize many different topologies and schedules for a given cluster size and degree, and then identify the appropriate topology and schedule for a given workload. We evaluate our approach on a 12-node optical testbed that uses patch panels to configure the desired topology, and we complement this with an analytical-model-based evaluation for larger deployments. We show that the derived topologies and schedules provide significant performance benefits over existing approaches.