The pursuit of high-performance data transfer often focuses on raw network bandwidth, with international links of 100 Gbps or higher frequently treated as the primary enabler. While necessary, this network-centric view is incomplete: it equates provisioned link speed with practical, sustainable data movement across the entire edge-to-core spectrum. This paper examines six common paradigms, ranging from network latency and TCP congestion control to host-side factors such as CPU performance and virtualization, that critically affect data movement workflows. These paradigms represent widely adopted engineering assumptions that shape system design, procurement decisions, and operational practice in production data movement environments. To address the fidelity gap between raw bandwidth and application-level throughput, we introduce the "Drainage Basin Pattern," a conceptual model for reasoning about end-to-end data flow constraints across heterogeneous hardware and software components. Our findings are validated through rigorous production-scale deployments, including U.S. DOE ESnet technical evaluations and transcontinental production trials over operational 100 Gbps links. The results demonstrate that the principal bottlenecks often reside outside the network core, and that holistic hardware-software co-design enables consistent, predictable performance for moving data at scale and speed.