Large scientific collaborations often share huge volumes of data around the world. Consequently, data replication and data access require a significant amount of network bandwidth. Users in the same region may share resources as well as data, especially when they work on related topics with similar datasets. In this work, we study the network traffic patterns and resource utilization of scientific data caches connecting European networks to the US. We explore the efficiency of resource utilization, especially for network traffic, which consists mostly of transatlantic data transfers, and the potential for deploying additional caching nodes. Our study shows that these data caches reduced network traffic volume by 97% during the study period, which demonstrates that such caching nodes are effective in reducing wide-area network traffic.