Error-bounded lossy compression has become a critical technique for reducing the volume of simulation data produced by high-performance computing (HPC) scientific applications while keeping data distortion within a user-specified error bound. In many real-world use cases, users must perform computational operations directly on the compressed data, a capability known as homomorphic compression. However, no existing error-bounded lossy compressor supports homomorphic operations, which inevitably incurs undesired decompression costs. In this paper, we propose a novel homomorphic error-bounded lossy compressor, called HoSZp, which not only enforces error bounds but also supports efficient computations (including negation, addition, multiplication, mean, and variance, among others) on the compressed data without a complete decompression step. To the best of our knowledge, this is the first such attempt. We develop several optimization strategies to maximize the overall compression ratio and execution performance. We evaluate HoSZp against other state-of-the-art lossy compressors on multiple real-world scientific application datasets.
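To make the idea of computing on compressed data concrete, the following is a minimal toy sketch in Python. It is not HoSZp's algorithm, which the abstract does not describe. It assumes a plain uniform scalar quantizer with bin width 2*eb, and all function names (`quantize`, `dequantize`, `negate`, `add`, `mean`) are hypothetical. It only illustrates that some operations can be applied to the integer codes directly, with a known effect on the error bound.

```python
# Toy illustration only: uniform scalar quantization, NOT the HoSZp method.
# All names here are hypothetical and chosen for this sketch.

def quantize(data, eb):
    """Map each value to an integer code; reconstruction error is at most eb."""
    return [round(x / (2 * eb)) for x in data]

def dequantize(codes, eb):
    """Reconstruct approximate values from integer codes."""
    return [q * 2 * eb for q in codes]

def negate(codes):
    """Negation acts on the codes directly; the error bound eb is unchanged."""
    return [-q for q in codes]

def add(codes_a, codes_b):
    """Element-wise sum of two code arrays quantized with the same eb.
    The result represents a + b with error at most 2 * eb."""
    return [a + b for a, b in zip(codes_a, codes_b)]

def mean(codes, eb):
    """Mean of the reconstructed values, computed from the codes alone.
    It differs from the true mean by at most eb."""
    return 2 * eb * sum(codes) / len(codes)

if __name__ == "__main__":
    eb = 0.01
    a = [1.234, -0.5, 3.1415, 2.0]
    b = [0.1, 0.2, 0.3, 0.4]
    ca, cb = quantize(a, eb), quantize(b, eb)
    print("negated:", dequantize(negate(ca), eb))
    print("sum:    ", dequantize(add(ca, cb), eb))
    print("mean:   ", mean(ca, eb))
```

In this sketch the codes are never expanded back to floating-point arrays before the operation is applied. A real compressor would additionally entropy-code or bit-pack the integers, which is where avoiding full decompression becomes non-trivial.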