The rapid expansion of computational capabilities and the ever-growing scale of modern HPC systems present formidable challenges in managing exascale scientific data. At this scale, traditional lossless compression techniques cannot reduce data to a manageable size while preserving all information. In response, researchers have turned to error-bounded lossy compression methods, which balance data size reduction against information retention. However, despite their utility, compressors built on conventional techniques offer limited reconstruction quality. To address this issue, we draw inspiration from recent advances in deep learning and propose GWLZ, a novel group-wise learning-based lossy compression framework with multiple lightweight learnable enhancer models. Leveraging a group of neural networks, GWLZ significantly enhances the reconstruction quality of decompressed data with negligible impact on compression efficiency. Experimental results on different fields from the Nyx dataset demonstrate remarkable improvements by GWLZ, with up to 20% quality enhancement at an overhead as low as 0.0003x.