High-performance Effective Scientific Error-bounded Lossy Compression with Auto-tuned Multi-component Interpolation

Error-bounded lossy compression has been identified as a promising solution for significantly reducing scientific data volumes while respecting user-specified limits on data distortion. Among existing scientific error-bounded lossy compressors, some (such as SPERR and FAZ) reach fairly high compression ratios and others (such as SZx, SZ, and ZFP) offer high compression speeds, but they rarely deliver both high ratio and high speed at once. In this paper, we propose HPEZ, which combines newly designed interpolations with quality-metric-driven auto-tuning. HPEZ significantly improves compression quality over existing high-performance compressors while being much faster than high-ratio compressors. Our key contributions are as follows: (1) We develop a series of advanced techniques, such as interpolation re-ordering, multi-dimensional interpolation, and natural cubic splines, to significantly improve the compression quality of interpolation-based data prediction. (2) We carefully design the auto-tuning module in HPEZ with novel strategies, including block-wise interpolation tuning, dynamic dimension freezing, and Lorenzo tuning. (3) We thoroughly evaluate HPEZ against many other compressors on six real-world scientific datasets. Experiments show that HPEZ outperforms other high-performance error-bounded lossy compressors in compression ratio by up to 140% under the same error bound and by up to 360% under the same PSNR. In parallel data transfer experiments on a distributed database, HPEZ reduces time cost by up to 40% compared with the second-best compressor.
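To make the notion of interpolation-based data prediction under an error bound concrete, the following is a minimal one-dimensional sketch. It is not the HPEZ implementation: it assumes plain linear interpolation on a coarse-to-fine grid with uniform quantization of the prediction residuals, and it omits entropy coding, cubic splines, multi-dimensional interpolation, and all auto-tuning. The function names (`traversal`, `predict`, `compress`, `decompress`) are hypothetical and chosen only for illustration.

```python
import math

def traversal(n):
    """Yield (index, left, right) from coarse to fine; right is None at the boundary."""
    stride = 1
    while stride * 2 < n:          # largest power-of-two stride below n
        stride *= 2
    while stride >= 1:
        for i in range(stride, n, 2 * stride):
            right = i + stride if i + stride < n else None
            yield i, i - stride, right
        stride //= 2

def predict(recon, left, right):
    """Linear interpolation from already reconstructed neighbors (assumed predictor)."""
    if right is None:
        return recon[left]         # fall back to the left neighbor at the boundary
    return 0.5 * (recon[left] + recon[right])

def compress(data, eb):
    """Return integer quantization codes; each point ends up within eb of its original."""
    n = len(data)
    recon = [0.0] * n              # what the decompressor will see
    codes = []

    def store(i, pred):
        q = round((data[i] - pred) / (2.0 * eb))   # uniform quantization of the residual
        codes.append(q)
        recon[i] = pred + 2.0 * eb * q             # predict later points from lossy values

    if n:
        store(0, 0.0)              # anchor point, predicted from zero
    for i, left, right in traversal(n):
        store(i, predict(recon, left, right))
    return codes

def decompress(codes, n, eb):
    """Replay the same traversal and predictions to rebuild the data."""
    recon = [0.0] * n
    it = iter(codes)
    if n:
        recon[0] = 0.0 + 2.0 * eb * next(it)
    for i, left, right in traversal(n):
        recon[i] = predict(recon, left, right) + 2.0 * eb * next(it)
    return recon

if __name__ == "__main__":
    eb = 1e-3
    data = [math.sin(0.1 * k) for k in range(100)]
    codes = compress(data, eb)
    out = decompress(codes, len(data), eb)
    print("max abs error:", max(abs(a - b) for a, b in zip(data, out)))
```

In this sketch the compressor predicts each point from reconstructed (lossy) neighbors rather than from the original values, so the decompressor can reproduce exactly the same predictions and the per-point error stays within the bound. On smooth data most codes are small integers, which is what a subsequent lossless encoding stage would exploit.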
