Large scientific collaborations often share huge volumes of data around the world, so significant network bandwidth is needed for data replication and data access. Users in the same region may share resources as well as data, especially when they work on related topics with similar datasets. In this work, we study the network traffic patterns and resource utilization of scientific data caches connecting European networks to the US. We explore the efficiency of resource utilization, especially for network traffic, which consists mostly of transatlantic data transfers, and the potential for deploying more caching nodes. Our study shows that these data caches reduced network traffic volume by 97% during the study period, demonstrating that such caching nodes are effective in reducing wide-area network traffic.