Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image. Based on this view, data can be compressed by overfitting a compact neural network to its functional representation and then encoding the network weights. However, most current solutions of this kind are inefficient, because quantizing the weights to low-bit precision substantially degrades reconstruction quality. To address this issue, we propose overfitting variational Bayesian neural networks to the data and compressing an approximate posterior weight sample using relative entropy coding, instead of quantizing and entropy coding it. This strategy enables direct optimization of the rate-distortion performance by minimizing the $\beta$-ELBO, and makes it possible to target different rate-distortion trade-offs for a given network architecture by adjusting $\beta$. Moreover, we introduce an iterative algorithm for learning prior weight distributions and employ a progressive refinement process for the variational posterior that significantly enhances performance. Experiments show that our method achieves strong performance on image and audio compression while retaining simplicity.
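As a minimal sketch of the objective described in the abstract, assume the $\beta$-ELBO takes the common form $\mathcal{L}_\beta = \mathbb{E}_{w \sim q}[\text{distortion}] + \beta \cdot D_{\mathrm{KL}}[q \,\|\, p]$, with a factorized Gaussian prior $p$ and posterior $q$ over the weights. The toy linear model, the mean-squared-error distortion, the single-sample Monte Carlo estimate, and the names `gaussian_kl` and `beta_elbo` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL[q || p] in nats, summed over weights, for factorized Gaussians."""
    return float(np.sum(
        np.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    ))


def beta_elbo(coords, signal, mu_q, sigma_q, mu_p, sigma_p, beta, rng):
    """Single-sample estimate of distortion + beta * KL (to be minimized).

    Assumption: a toy linear model `coords @ w` stands in for the
    coordinate-to-signal network.
    """
    # Reparameterized posterior sample: w = mu + sigma * eps, eps ~ N(0, I).
    w = mu_q + sigma_q * rng.standard_normal(mu_q.shape)
    pred = coords @ w
    distortion = float(np.mean((pred - signal) ** 2))  # assumed MSE distortion
    rate = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)   # rate proxy, in nats
    return distortion + beta * rate, distortion, rate


# Hypothetical toy data: 2-D coordinates mapped to a scalar signal.
data_rng = np.random.default_rng(0)
coords = data_rng.uniform(-1.0, 1.0, size=(64, 2))
signal = coords @ np.array([0.5, -0.25])

# Posterior and prior parameters (arbitrary illustrative values).
mu_q, sigma_q = np.array([0.4, -0.2]), np.array([0.1, 0.1])
mu_p, sigma_p = np.zeros(2), np.ones(2)

for beta in (1e-3, 1e-1):
    loss, distortion, rate = beta_elbo(
        coords, signal, mu_q, sigma_q, mu_p, sigma_p,
        beta, np.random.default_rng(1),
    )
    print(f"beta={beta:g}  loss={loss:.4f}  distortion={distortion:.4f}  "
          f"KL={rate / np.log(2):.2f} bits")
```

In this sketch, a larger $\beta$ weights the KL term more heavily, which pushes the posterior toward the prior and lowers the rate at the cost of higher distortion. The KL term serves as the rate proxy because the cost of encoding a posterior sample with relative entropy coding is approximately the KL divergence.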