Preserving Discrete Morse-Smale Complexes in Error-Bounded Lossy Compression

Scientific applications are generating unprecedented volumes of data that overwhelm storage and transmission systems, posing significant challenges for the design of data management tools and scientific databases. Lossy compression has emerged as a promising strategy to address this problem, but most existing compressors fail to preserve the topology of scientific data, leading to inaccuracies in downstream analyses and potentially erroneous scientific conclusions. In this work, we present a methodology for fully preserving topology, specifically Morse-Smale complexes (MSCs), in lossy-compressed 2D and 3D scalar field data from scientific simulations. We generalize the edit-based strategy introduced in MSz, a previous method that preserves only segmentations and cannot preserve saddles or separatrices, to full MSCs, including all critical points and separatrices. Our approach corrects the MSCs in the decompressed output of any error-bounded lossy compressor (e.g., SZ3 or ZFP), referred to as the base compressor, using an iterative editing strategy that preserves all critical points and their connectivity via separatrices. During compression, we generate a sequence of quantized edits that are later applied to the decompressed output, ensuring accurate preservation of topological features while keeping the error within the prescribed bound. The strategy fixes critical points and separatrices in alternating steps and converges in a finite number of iterations. To meet diverse application needs, our method offers flexible options that balance compression efficiency against feature preservation. To reduce computation time, we leverage GPU parallelism to accelerate each component of the workflow. Experiments on multiple datasets demonstrate that our method achieves 100% preservation of Morse-Smale complexes.
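To make the edit-based idea concrete, the following is a minimal 1D sketch, not the method described above. It assumes that "topology" reduces to the ordering of adjacent samples (which fixes the local minima and maxima of a 1D signal), that edits only ever lower values, and that ties are broken by index. The function names (`below`, `extrema`, `correct`), the step size, and the sample data are all hypothetical; quantization of the edits, separatrices, and 2D/3D connectivity are omitted.

```python
def below(values, i, j):
    """True if sample i is ordered below sample j (ties broken by index)."""
    return (values[i], i) < (values[j], j)


def extrema(values):
    """Return (minima, maxima) index lists of a 1D sequence."""
    n = len(values)
    mins, maxs = [], []
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        if all(below(values, i, j) for j in nbrs):
            mins.append(i)
        elif all(below(values, j, i) for j in nbrs):
            maxs.append(i)
    return mins, maxs


def correct(original, decompressed, xi, step, max_iters=10_000):
    """Lower samples of `decompressed` until every adjacent pair is ordered
    as in `original`, never going below original[i] - xi.

    Assumes |original[i] - decompressed[i]| <= xi on input.
    Returns the edited signal and the per-sample edits (edited - decompressed).
    """
    edited = list(decompressed)
    floor = [v - xi for v in original]  # lowest value allowed by the error bound
    for _ in range(max_iters):
        changed = False
        for i in range(len(original) - 1):
            # lo is the sample that must be ordered below hi, as in the original.
            lo, hi = (i, i + 1) if below(original, i, i + 1) else (i + 1, i)
            if not below(edited, lo, hi):
                new_value = max(floor[lo], edited[lo] - step)
                if new_value != edited[lo]:
                    edited[lo] = new_value
                    changed = True
        if not changed:  # a full sweep without edits: all pairs are ordered correctly
            break
    edits = [e - d for e, d in zip(edited, decompressed)]
    return edited, edits


# Hypothetical data: the decompressed signal is within xi of the original
# but has lost the maximum at index 1 and the minimum at index 2.
original = [0.0, 1.0, 0.9, 2.0, 1.5]
decompressed = [0.1, 0.8, 1.0, 1.9, 1.6]
xi = 0.25
edited, edits = correct(original, decompressed, xi, step=0.05)
```

Because values only decrease and are clamped at `original[i] - xi`, the edited signal stays within the error bound. The loop terminates because a pair that is still out of order always has room to move: in exact arithmetic, the sample that should be lower cannot already sit at its floor while its neighbor, which must stay within the bound, lies below it.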
