The rapid growth of scientific data is outpacing advances in computing, creating challenges in storage, transfer, and analysis, particularly at exascale. While data reduction techniques such as lossless and lossy compression help mitigate these issues, their computational overhead introduces new bottlenecks. GPU-accelerated approaches improve performance but face challenges in portability, memory transfer, and scalability on multi-GPU systems. To address these challenges, we propose HPDR, a high-performance, portable data reduction framework. HPDR supports diverse processor architectures, reducing memory transfer overhead to 2.3% and achieving up to 3.5x higher throughput than existing solutions. It attains 96% of the theoretical speedup in multi-GPU settings. Evaluations on the Frontier supercomputer demonstrate 103 TB/s throughput and up to 4x acceleration in parallel I/O performance at scale. HPDR offers a scalable, efficient solution for managing massive data volumes in exascale computing environments.