We explore an error-bounded lossy compression approach for reducing scientific data associated with 2D/3D unstructured meshes. While existing lossy compressors offer high compression ratios with bounded error for regular grid data, methodologies tailored to unstructured mesh data are lacking; for example, one can compress nodal data as 1D arrays, but doing so neglects the spatial coherency of the mesh nodes. Inspired by the SZ compressor, which predicts and quantizes values in a multidimensional array, we dynamically reorganize nodal data into sequences. Each sequence starts with a seed cell; following a predefined traversal order, the next cell is added to the sequence if the current cell can predict and quantize the nodal data in the next cell within the given error bound. As a result, one can efficiently compress the quantized nodal data in each sequence until all mesh nodes are traversed. This paper also introduces a suite of novel error metrics, namely continuous mean squared error (CMSE) and continuous peak signal-to-noise ratio (CPSNR), to assess compression results for unstructured mesh data. The continuous error metrics are defined by integrating the error function over all cells, providing objective statistics across nonuniformly distributed nodes/cells in the mesh. We evaluate our methods on several scientific simulations, ranging from ocean-climate models to computational fluid dynamics simulations, with both traditional and continuous error metrics. Our methods achieve higher compression ratios and better quality than existing lossy compressors.
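The sequence-building idea can be illustrated with a deliberately simplified sketch. It is not the method of this paper: it assumes the nodal values have already been flattened into a 1D traversal order, it predicts each value from the previously reconstructed one, and it uses SZ-style linear-scaling quantization with bins of width twice the error bound. The function name `compress_sequences` and the parameter `max_code` (the largest quantization code a sequence may hold) are hypothetical and chosen only for illustration.

```python
def compress_sequences(values, eb, max_code=127):
    """Greedy error-bounded quantization over a 1D traversal (illustrative only).

    Returns (sequences, recon). Each sequence is a pair (seed_value, codes);
    recon holds the values a decompressor would reconstruct.
    """
    sequences = []
    recon = []
    for v in values:
        if sequences:
            pred = recon[-1]                     # predict from the last reconstructed value
            code = round((v - pred) / (2 * eb))  # quantize the residual in bins of width 2*eb
            r = pred + 2 * eb * code             # value the decompressor would recover
            # Extend the current sequence only if the code fits and the bound holds.
            if abs(code) <= max_code and abs(v - r) <= eb:
                sequences[-1][1].append(code)
                recon.append(r)
                continue
        # First value, or prediction failed: start a new sequence with an exact seed.
        sequences.append((v, []))
        recon.append(v)
    return sequences, recon


# Example run: the jump to 5.0 cannot be quantized with |code| <= 3,
# so it becomes the seed of a second sequence.
values = [1.0, 1.05, 1.12, 5.0, 5.01]
sequences, recon = compress_sequences(values, eb=0.1, max_code=3)
print(sequences)
print(max(abs(a - b) for a, b in zip(values, recon)))
```

Predicting from reconstructed rather than original values keeps the compressor and decompressor in step, so the pointwise error never exceeds `eb`. In the approach described above, the traversal runs over mesh cells instead of a flat array.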