Lossy compression is one of the most effective methods for reducing the size of scientific data containing multiple data fields. It compresses data by reducing information density through prediction-based or transform-based techniques. Previous approaches use only local information from a single target field when predicting target data points, which limits the compression ratios they can achieve. In this paper, we identify significant cross-field correlations within scientific datasets. We propose a novel hybrid prediction model that uses a convolutional neural network (CNN) to extract cross-field information and combines it with existing local field information. Our solution enhances the prediction accuracy of lossy compressors, leading to improved compression ratios without compromising data quality. We evaluate our solution on three scientific datasets and show that it improves compression ratios by up to 25% under specific error bounds. Additionally, our solution preserves more data details and reduces artifacts compared to baseline approaches.
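The idea of blending a local prediction with a cross-field prediction inside an error-bounded, prediction-based compressor can be illustrated with a minimal one-dimensional sketch. It is not the model proposed in the paper: the CNN is replaced here by a fixed linear map of a second, already-decoded field (called `anchor`), and the names `hybrid_predict`, `weight`, `scale`, and `offset` are hypothetical parameters introduced only for illustration. The quantization step (residual divided by twice the error bound, then rounded) is a common linear-scale scheme and is likewise an assumption rather than something the abstract specifies.

```python
import math


def hybrid_predict(prev_recon, anchor_value, weight, scale, offset):
    """Blend a local prediction with a cross-field prediction."""
    # Local part: the previously reconstructed value of the target field.
    local = prev_recon
    # Cross-field part: a linear map of the other field (assumption; this
    # stands in for the output of a learned model such as a CNN).
    cross = scale * anchor_value + offset
    return weight * local + (1.0 - weight) * cross


def compress(target, anchor, eb, weight=0.5, scale=1.0, offset=0.0):
    """Return integer quantization codes and the reconstructed values."""
    codes, recon = [], []
    prev = 0.0
    for x, a in zip(target, anchor):
        pred = hybrid_predict(prev, a, weight, scale, offset)
        # Quantize the prediction residual in bins of width 2 * eb, so the
        # reconstruction error stays within eb (up to float rounding).
        q = round((x - pred) / (2.0 * eb))
        prev = pred + q * 2.0 * eb
        codes.append(q)
        recon.append(prev)
    return codes, recon


def decompress(codes, anchor, eb, weight=0.5, scale=1.0, offset=0.0):
    """Rebuild the target field from the codes and the anchor field."""
    recon = []
    prev = 0.0
    for q, a in zip(codes, anchor):
        # Same prediction as in compress, driven by reconstructed values only.
        pred = hybrid_predict(prev, a, weight, scale, offset)
        prev = pred + q * 2.0 * eb
        recon.append(prev)
    return recon


# Synthetic example: the target field is strongly correlated with the anchor.
anchor = [math.sin(0.1 * i) for i in range(200)]
target = [2.0 * a + 0.5 + 0.01 * math.cos(3.0 * i) for i, a in enumerate(anchor)]
eb = 1e-2

codes, recon = compress(target, anchor, eb, weight=0.5, scale=2.0, offset=0.5)
restored = decompress(codes, anchor, eb, weight=0.5, scale=2.0, offset=0.5)
```

In a full compressor the integer codes would then be entropy-coded, so a more accurate prediction (more codes near zero) translates into a higher compression ratio. The predictor only ever reads reconstructed values, which is what lets the decoder reproduce the same predictions.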