Many extreme-scale applications require the movement of large quantities of data to, from, and among leadership computing facilities, as well as other scientific facilities and the home institutions of facility users. These applications, particularly when leadership computing facilities are involved, can encounter edge cases (e.g., terabyte files) that were not a focus of previous Globus optimization work, which instead emphasized the movement of many smaller (megabyte to gigabyte) files. We report here on how automated client-driven chunking can be used to accelerate both the movement of large files and the integrity checking operations that have proven essential for large data transfers. We present detailed performance studies that provide insight into the benefits of these modifications across a range of file transfer scenarios.
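To make the idea of client-driven chunking concrete, the following is a minimal local sketch and not the Globus implementation: it splits one large file into byte ranges, copies the ranges concurrently, and then checksums each range on both sides so that only mismatched ranges would need to be resent. The function names (`chunk_ranges`, `copy_chunk`, `checksum_chunk`, `chunked_copy_and_verify`), the use of SHA-256, the thread pool, and the chunk size are all assumptions made for illustration; a real transfer would move the ranges over the network between endpoints.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = 1 << 20  # read/write granularity inside a chunk (1 MiB), an arbitrary choice


def chunk_ranges(file_size, chunk_size):
    """Split [0, file_size) into contiguous (offset, length) byte ranges."""
    return [(off, min(chunk_size, file_size - off))
            for off in range(0, file_size, chunk_size)]


def copy_chunk(src, dst, offset, length):
    """Copy one byte range from src to the same offset in dst."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        remaining = length
        while remaining > 0:
            block = fin.read(min(BLOCK, remaining))
            if not block:
                break
            fout.write(block)
            remaining -= len(block)


def checksum_chunk(path, offset, length):
    """Return the SHA-256 hex digest of one byte range of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            block = f.read(min(BLOCK, remaining))
            if not block:
                break
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest()


def chunked_copy_and_verify(src, dst, chunk_size, workers=4):
    """Copy src to dst chunk by chunk, then return the ranges whose checksums differ."""
    size = os.path.getsize(src)
    ranges = chunk_ranges(size, chunk_size)
    # Pre-size the destination so each worker can write its own range independently.
    with open(dst, "wb") as f:
        f.truncate(size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda r: copy_chunk(src, dst, *r), ranges))
        src_sums = list(pool.map(lambda r: checksum_chunk(src, *r), ranges))
        dst_sums = list(pool.map(lambda r: checksum_chunk(dst, *r), ranges))
    return [r for r, a, b in zip(ranges, src_sums, dst_sums) if a != b]
```

Because each range is copied and verified independently, a failed integrity check in this sketch identifies a single chunk rather than the whole file, which is the property that makes chunking attractive for terabyte-scale files.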