Weather and climate simulations produce petabytes of high-resolution data that researchers later analyze to understand climate change or severe weather. We propose a new method of compressing this multidimensional weather and climate data: a coordinate-based neural network is trained to overfit the data, and the resulting parameters are taken as a compact representation of the original grid-based data. At compression ratios ranging from 300x to more than 3,000x, our method outperforms the state-of-the-art compressor SZ3 in terms of weighted RMSE and MAE. It faithfully preserves important large-scale atmospheric structures and does not introduce artifacts. When the resulting neural network is used as a 790x compressed dataloader to train the WeatherBench forecasting model, the model's RMSE increases by less than 2%. Compression by three orders of magnitude democratizes access to high-resolution climate data and enables numerous new research directions.
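The core idea, fitting a network that maps grid coordinates to field values and then storing only its parameters, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method described above: the tiny one-hidden-layer tanh network, the synthetic field, plain SGD, and all function names (`make_grid`, `init_params`, `train`, and so on) are hypothetical, and the ratio it prints says nothing about the 300x to 3,000x figures reported for real data.

```python
import math
import random


def make_grid(n_lat, n_lon):
    """Synthetic smooth field on a regular grid (a stand-in for real data)."""
    samples = []
    for i in range(n_lat):
        for j in range(n_lon):
            x = 2.0 * i / (n_lat - 1) - 1.0  # coordinates normalized to [-1, 1]
            y = 2.0 * j / (n_lon - 1) - 1.0
            samples.append(((x, y), math.sin(math.pi * x) * math.cos(math.pi * y)))
    return samples


def init_params(hidden, rng):
    """Random weights for a 2 -> hidden -> 1 network (toy architecture)."""
    return {
        "w1": [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
        "b2": 0.0,
    }


def predict(params, coord):
    """Return the predicted field value and the hidden activations."""
    h = [math.tanh(w[0] * coord[0] + w[1] * coord[1] + b)
         for w, b in zip(params["w1"], params["b1"])]
    return sum(w * a for w, a in zip(params["w2"], h)) + params["b2"], h


def mse(params, samples):
    return sum((predict(params, c)[0] - v) ** 2 for c, v in samples) / len(samples)


def train(params, samples, epochs, lr, rng):
    """Deliberately overfit the grid with plain per-sample SGD."""
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for coord, value in order:
            pred, h = predict(params, coord)
            err = pred - value  # gradient of 0.5 * err**2 w.r.t. the prediction
            for k in range(len(h)):
                # Backpropagate through tanh before updating w2[k].
                grad_pre = err * params["w2"][k] * (1.0 - h[k] ** 2)
                params["w2"][k] -= lr * err * h[k]
                params["w1"][k][0] -= lr * grad_pre * coord[0]
                params["w1"][k][1] -= lr * grad_pre * coord[1]
                params["b1"][k] -= lr * grad_pre
            params["b2"] -= lr * err


def count_params(params):
    hidden = len(params["w2"])
    return 2 * hidden + hidden + hidden + 1  # w1, b1, w2, b2


if __name__ == "__main__":
    rng = random.Random(0)
    samples = make_grid(32, 32)
    params = init_params(16, rng)
    print("MSE before training:", mse(params, samples))
    train(params, samples, epochs=20, lr=0.02, rng=rng)
    print("MSE after training: ", mse(params, samples))
    # The "compressed file" is just the parameter set.
    print("values / parameters:", len(samples) / count_params(params))
```

Decompression in this sketch amounts to calling `predict` at every grid coordinate, which is also why such a network can serve directly as a dataloader: any sample can be regenerated on demand from the stored parameters.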