We explore an error-bounded lossy compression approach for reducing scientific data associated with 2D/3D unstructured meshes. While existing lossy compressors offer high compression ratios with bounded error for regular grid data, methodologies tailored to unstructured mesh data are lacking; for example, compressing nodal data as 1D arrays neglects the spatial coherency of the mesh nodes. Inspired by the SZ compressor, which predicts and quantizes values in a multidimensional array, we dynamically reorganize nodal data into sequences. Each sequence starts with a seed cell; following a predefined traversal order, the next cell is added to the sequence if the current cell can predict and quantize the nodal data in the next cell within the given error bound. As a result, one can efficiently compress the quantized nodal data in each sequence until all mesh nodes are traversed. This paper also introduces two novel error metrics, continuous mean squared error (CMSE) and continuous peak signal-to-noise ratio (CPSNR), to assess compression results for unstructured mesh data. The continuous error metrics are defined by integrating the error function over all cells, providing objective statistics across nonuniformly distributed nodes/cells in the mesh. We evaluate our methods on several scientific simulations, ranging from ocean-climate models to computational fluid dynamics simulations, using both traditional and continuous error metrics. We demonstrate higher compression ratios and better quality than existing lossy compressors.
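To make the two central ideas concrete, the following is a minimal Python sketch, not the implementation evaluated in this paper. The first function shows SZ-style error-bounded linear quantization of a prediction residual. The remaining functions show one possible reading of the continuous metrics for a 2D triangle mesh, under three assumptions that the abstract does not state: the error is linearly interpolated inside each triangle, the integrated squared error is normalized by the total mesh area, and CPSNR is formed from CMSE by analogy with the conventional PSNR. All function and parameter names are hypothetical.

```python
import math


def quantize(value, prediction, error_bound):
    """Quantize the residual in bins of width 2 * error_bound.

    Returns (code, reconstructed) with |value - reconstructed| <= error_bound
    up to floating-point rounding.
    """
    code = round((value - prediction) / (2.0 * error_bound))
    reconstructed = prediction + 2.0 * error_bound * code
    return code, reconstructed


def triangle_area(p0, p1, p2):
    """Area of a 2D triangle from its vertex coordinates."""
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
    return 0.5 * abs(cross)


def continuous_mse(points, triangles, original, decompressed):
    """Area-weighted mean of the squared error (assumed definition).

    With a linearly interpolated error e on a triangle of area A and nodal
    errors e0, e1, e2, the exact integral of e^2 over the triangle is
    A / 6 * (e0^2 + e1^2 + e2^2 + e0*e1 + e1*e2 + e0*e2).
    """
    total_area = 0.0
    total_squared_error = 0.0
    for i, j, k in triangles:
        area = triangle_area(points[i], points[j], points[k])
        e0 = original[i] - decompressed[i]
        e1 = original[j] - decompressed[j]
        e2 = original[k] - decompressed[k]
        total_squared_error += area / 6.0 * (
            e0 * e0 + e1 * e1 + e2 * e2 + e0 * e1 + e1 * e2 + e0 * e2
        )
        total_area += area
    return total_squared_error / total_area


def continuous_psnr(points, triangles, original, decompressed):
    """PSNR-style ratio built on CMSE (assumed by analogy with PSNR)."""
    cmse = continuous_mse(points, triangles, original, decompressed)
    value_range = max(original) - min(original)
    return 20.0 * math.log10(value_range) - 10.0 * math.log10(cmse)


# Usage: a unit square split into two triangles, with a uniform nodal error.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
original = [1.0, 2.0, 3.0, 4.0]
decompressed = [v + 0.1 for v in original]
cmse = continuous_mse(points, triangles, original, decompressed)
cpsnr = continuous_psnr(points, triangles, original, decompressed)
```

Because each cell contributes in proportion to its area, a region of densely packed small cells does not dominate the statistic the way it would in a per-node average, which is the motivation given above for the continuous metrics.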